Image processing method

ABSTRACT

A method and apparatus for image processing in which data for image processing are imprinted into the images to be processed. An image input unit receives a first image and a second image for encoding. A matching processor performs a pixel-by-pixel matching between the images and transmits a corresponding point file to an imprinting unit. The corresponding point file and a program for processing the images and the corresponding point file are imprinted into the first image and an altered first image is generated. In decoding, an extracting unit extracts the corresponding point file from the altered first image and an intermediate image generator utilizes the program to generate intermediate images from the first image, the second image and the corresponding point file.

BACKGROUND OF THE INVENTION

[0001] 1. Field of the Invention

[0002] The present invention relates to image processing techniques, and more particularly relates to techniques of encoding and decoding images for transmission or storage.

[0003] 2. Description of the Related Art

[0004] Recently, image processing and compression methods such as those proposed by MPEG (Moving Picture Experts Group) have expanded to be used with transmission media such as networks and broadcast rather than just storage media such as CDs. Generally speaking, the success of the digitization of broadcast materials has been caused at least in part by the availability of MPEG compression coding technology. In this way, a barrier that previously existed between broadcast and other types of communication has begun to disappear, leading to a diversification of service-providing businesses. Thus, we are facing a situation where it is hard to predict how the digital culture will evolve in this age of broadband.

[0005] Even in such a chaotic situation, it is clear that the direction of the compression technology of motion pictures will be to move to both higher compression rates and better image quality. It is a well-known fact that block distortion in MPEG compression is sometimes responsible for causing degraded image quality and preventing the compression rate from being improved.

SUMMARY OF THE INVENTION

[0006] The present invention has been developed in view of the above situation and is intended to provide encoding and decoding techniques for the efficient compression of image data. Another object of the present invention is to provide encoding and decoding techniques that attempt to meet two opposing requirements: (1) to keep good image quality and (2) to achieve a higher rate of compression.

[0007] An embodiment of the present invention relates to an image processing technology. This technology may utilize the image matching technology (hereinafter referred to as the “base technology”) which was proposed in Japanese Patent No. 2927350, U.S. Pat. No. 6,018,592 and U.S. Pat. No. 6,137,910 assigned to the same assignee.

[0008] An image processing method according to an embodiment of the present invention comprises: acquiring images; and imprinting data utilized for processing the images into the images. In a particular case, the “data utilized for processing” may be data that is used for decoding the images when the images are initially encoded. In this case, the present invention can be considered as an image encoding technology. In another particular case, the “data utilized for processing” may be data which instructs a processing of the images, such as, for example, “Display the images after decompression”. Still another particular case involves data or parameters that are used for image processing. For example, the parameters may be parallax data of each point or each pixel on the images such that a pseudo three-dimensional image of the images can be displayed based on the data.

[0009] Various known technologies describe imprinting copyright information on an image as a watermark in a visible or invisible manner; however, in these techniques, the information or data imprinted is not used for the processing which is performed on the image. According to this embodiment of the present invention, desired processing can be included with and performed on the images because data regarding the image processing is imprinted. The security of the data is enhanced in distributing or reproducing the images because the processing data or content can be concealed easily if the data are imprinted in an “invisible” manner. Further, in many cases, a reproducing device which is not aware of the existence of the data can reproduce at least a part of the images because the image frames themselves are distributed. Backward compatibility is, therefore, sufficiently provided.
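By way of illustration only, imprinting in an “invisible” manner and the corresponding extraction may be sketched as follows. The sketch assumes that the data are written into the least significant bit of 8-bit pixel values; this scheme, and the names imprint and extract, are hypothetical and are not specified by the present description.

```python
def imprint(pixels, payload):
    """Return a copy of `pixels` (8-bit values) with `payload` bytes hidden
    in the least significant bits. LSB embedding is an assumption."""
    # Expand the payload into bits, most significant bit first.
    bits = [(byte >> k) & 1 for byte in payload for k in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload too large for this image")
    altered = list(pixels)
    for idx, bit in enumerate(bits):
        # Clear the lowest bit of the pixel, then set it to the payload bit.
        altered[idx] = (altered[idx] & ~1) | bit
    return altered


def extract(pixels, n_bytes):
    """Recover `n_bytes` bytes previously written by `imprint`."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for k in range(8):
            value = (value << 1) | (pixels[8 * b + k] & 1)
        out.append(value)
    return bytes(out)
```

Each pixel value changes by at most one, so a reproducing device that ignores the imprinted data still displays essentially the same image, which is consistent with the backward compatibility noted above.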

[0010] Another embodiment of the present invention comprises: acquiring a first image and a second image; computing a matching between the acquired first and second images; and imprinting the information of corresponding points acquired as a result of the matching into at least one of the first and second images. The information may be imprinted into only one of the first and second images, may be imprinted into both of the images separately, or may be distributed between the two images. Further, the information of the corresponding points between the first image and the second image may be imprinted into the first image and the information of corresponding points between the second image and a third image may be imprinted into the second image and generally the information of corresponding points between an n-th image and an (n+1)-th image may be imprinted into the n-th image. Besides this, the information of corresponding points between any combination of the images may be imprinted into any image, and it is especially expedient that the information is imprinted into a data structure that is closed as a motion picture when the images form a motion picture stream.

[0011] Another embodiment of the present invention relates to an image processing apparatus. This apparatus comprises an image input unit which acquires images and an imprinting unit which imprints data utilized for the processing which is performed on the acquired images, decoding for example, into the images. Still another embodiment comprises an image input unit which acquires a first image and a second image, a matching processor which computes a matching between the acquired first and second images, and an imprinting unit which imprints the information of the corresponding points acquired as a result of the matching into at least one of the first and second images, or imprints into the motion picture stream which comprises those images.

[0012] In particular, the matching processor may detect points on the second image which correspond to lattice points of a mesh set on the first image and a destination polygon which corresponds to a source polygon that constitutes the mesh on the first image may be defined on the second image based on the result of detection.

[0013] Matching methods utilizing critical points may be an application of the base technology. The base technology, however, does not touch on processing regarding the lattice points or the polygons determined thereby. The introduction (below) of a technique making use of a mesh and polygons makes possible a significant reduction in the size of a file which describes the correspondence relation of points between the first image and the second image (herein referred to as a “corresponding point file”).

[0014] Namely, in a case where the first and second images have n×m pixels respectively, there are (n×m)² combinations if pixel-by-pixel correspondence is described as is, such that the size of the corresponding point file may become extremely large. However, instead, this correspondence is modified by describing the correspondence relation between the lattice points or, substantially equivalently, the correspondence relation between polygons determined by the lattice points, so that the data amount is reduced significantly. Motion pictures can be reproduced by having only a first image (key frame) or a second image (key frame) and the corresponding point file, with intermediate images (frames) between the first and second images (key frames) discarded, and this method realizes efficient transmission or storing of motion pictures.
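The scale of this reduction may be illustrated by a short calculation. The image size of 256×256 pixels and the lattice spacing of 16 pixels are hypothetical values chosen only for the example, as are the function names.

```python
def pixel_pair_combinations(n, m):
    """Number of source/destination pixel pairs, (n*m)^2, for n x m images."""
    return (n * m) ** 2


def lattice_point_count(n, m, spacing):
    """Number of lattice points of a mesh laid every `spacing` pixels,
    counting the points on both borders (an assumed mesh layout)."""
    return (n // spacing + 1) * (m // spacing + 1)


n, m, spacing = 256, 256, 16               # hypothetical example values
print(pixel_pair_combinations(n, m))       # 4294967296
print(lattice_point_count(n, m, spacing))  # 289
```

Under these assumptions the corresponding point file would need to record one destination position for each of 289 lattice points rather than for each of the 65,536 pixels.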

[0015] Still another embodiment of the present invention relates to a method utilized in reproducing the images. The method comprises: acquiring the images; and extracting data utilized for the processing which is performed on the images, such as decoding for example, from the images. This method may further comprise: performing the interpolation of the images based on the data; and outputting, for example storing, transmitting or displaying, the motion pictures acquired as the result of the interpolation. This embodiment can be, therefore, considered as an image decoding method.

[0016] Yet another embodiment of the present invention relates to an apparatus utilized in reproducing images. This apparatus comprises an image input unit which acquires the images, and an extracting unit which extracts the data from the images which is utilized for the processing performed on the images. The apparatus may further comprise an intermediate image generator which performs the interpolation of the images based on the extracted data, and an output unit which outputs the motion pictures acquired as the result of the interpolation.

[0017] Yet another embodiment of the present invention relates to an image processing method and it particularly relates to an image encoding method. This method may comprise: acquiring a first image and a second image as key frames, which are respectively a predetermined distance from each other; computing a matching between the acquired first and second images; compressing the first and second images in an intraframe format; imprinting the information of the corresponding points acquired as the result of the matching into a predetermined image in the motion picture stream which comprises the compressed first and second images; generating a coded motion picture stream which comprises at least the first and second images and the predetermined image as the key frames after imprinting; and outputting the coded motion picture stream which is generated.

[0018] In this case, as another embodiment, the information or data of the corresponding points acquired as the result of the matching may be imprinted into at least one of the first image and the second image in imprinting the information into the predetermined image in the motion picture stream, and only the compressed first and second images may be comprised in the motion picture stream as the key frames in generating the coded motion picture stream.

[0019] In the present description the phrase “compressing in an intraframe format” is intended to mean compressing an image in such a format that decompression processing can be performed by referring solely within the image frame. Various formats of this type are known, for example, the compression of still pictures using the JPEG (Joint Photographic Experts Group) format.

[0020] An image decoding method according to this embodiment may comprise: decompressing the first and the second images and so forth in the intraframe format after acquiring those images; extracting the information or data of the corresponding points from the first image or the second image or the like into which the information or data has been imprinted; generating intermediate images from the information or data of the corresponding points and the first and second images, which are the key frames, by computing interpolation; and outputting, for example storing or displaying, the generated intermediate images and the first and second images in order.
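The step of generating intermediate images by computing interpolation may, for example, be sketched as below. The sketch assumes that both the position and the pixel value of each pair of corresponding points are interpolated linearly with a parameter t between 0 and 1; the present description states only that interpolation is computed, and the function names are hypothetical.

```python
def interpolate_point(p, q, value_p, value_q, t):
    """Linearly interpolate one pair of corresponding points.

    p, q       -- (x, y) positions on the first and second key frames
    value_p/q  -- pixel values at p and q
    t          -- 0 gives the first key frame, 1 gives the second
    """
    x = (1 - t) * p[0] + t * q[0]
    y = (1 - t) * p[1] + t * q[1]
    value = (1 - t) * value_p + t * value_q
    return (x, y), value


def intermediate_frame(correspondences, t):
    """Apply the interpolation to every entry of a corresponding point list.

    `correspondences` is a list of (p, q, value_p, value_q) tuples, an
    assumed in-memory form of the corresponding point file.
    """
    return [interpolate_point(p, q, vp, vq, t) for p, q, vp, vq in correspondences]
```

For instance, a point at (0, 0) with value 100 that corresponds to a point at (4, 2) with value 200 would appear at (2.0, 1.0) with value 150.0 in the frame at t = 0.5.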

[0021] Another embodiment of the image processing method according to the present invention comprises: acquiring the images; imprinting data utilized for performing the processing on the images (hereinafter referred to as “target data”) into the images; and distributing an electronic key to a user for extracting the target data. Here, the user is generally a user who acquires the images after the imprinting. The “electronic key” may be an appropriate electronic or digital key as is known or becomes known in the art. In particular, the key may be substantially comprised of data or a program and may, for example, comprise the following various elements and combinations thereof and is utilized at a decoding side:

[0022] 1) A key which decodes the target data when the data are encoded.

[0023] 2) A key which extracts the target data by performing processing reverse to the imprinting process of the data.

[0024] 3) A key which authenticates the user.

[0025] 4) A key which decodes the entire images when the images including the target data are encoded.

[0026] The information or data of the corresponding points is illustrated as just one example of the target data. These variations and other variations regarding the key are also effective throughout this specification.

[0027] Yet another embodiment of the present invention relates to an image processing method. This method comprises: acquiring the images; and imprinting a program for reproducing or decoding the images into the images. For “reproducing”, for example, the program may be a so-called viewer or an image player. For “decoding”, for example, an image processing program may be provided which converts discrete image frames into continuous motion pictures by interpolation or other processing when the images comprise discrete image frames. Processing which generates intermediate frames between key frames based on the information of corresponding points between the key frames can be considered as an example of this type of processing and is described in the “base technology” below.

[0028] Yet another embodiment of the present invention further relates to an image processing method. This method comprises: acquiring first and second images; computing a matching between the acquired first and second images; imprinting information of corresponding points acquired as the result of the matching into at least one of the first and second images; and imprinting a program into at least one of the first and second images, which is utilized for generating an intermediate image of the first and second images based on the imprinted information of the corresponding points. In “imprinting”, it is useful to note that the information of the corresponding points may be imprinted into any image included in a motion picture stream which comprises the first and the second images.

[0029] Yet another embodiment of the present invention relates to an image processing method. This method is mainly utilized in decoding and comprises: acquiring images; and extracting a program for reproducing or decoding the images, which is imprinted into the acquired images. The method may further comprise: 1) acquiring an electronic key for extracting the program from the images; 2) extracting information of corresponding points imprinted into the images in addition to the program; 3) generating motion pictures based on the images by executing the program; and so forth.

[0030] Yet another embodiment of the present invention further relates to an image processing method. This method comprises: acquiring first and second images as the key frames, which are respectively a predetermined distance from each other; computing a matching between the acquired first and second images; compressing the first and second images in an intraframe format; imprinting a program into at least one of the compressed first and second images, which generates intermediate images of the first and the second images utilizing the result of the matching; generating a coded motion picture stream which includes at least the compressed first and second images as the key frames; and outputting the coded motion picture stream which is generated. The program may be imprinted into a predetermined image in the motion picture stream which comprises the compressed first and second images.

[0031] Yet another embodiment of the present invention relates to a content storing method. This method comprises: acquiring a digital content; and imprinting a program for reproducing or decoding the content into the content. In particular, the content may have particularity in relation to the program in that the entire content can be reproduced by utilizing the program, even though the content is stored in a generalized format in which the content can be partly reproduced without the program. Alternatively, the content may also have particularity in relation to the program in that the content can be reproduced with high quality utilizing the program, even though the content is stored in a generalized format in which the content can be reproduced with low quality without the program.

[0032] It is to be noted that the base technology is not a requirement of the present invention. Further, it is also possible to replace or substitute the above-described structural components and elements of methods, in part or in whole, as between method and apparatus, or to add elements to either method or apparatus. The apparatuses and methods may also be implemented by a computer program and saved on a recording medium or the like. All such variations are effective as and encompassed by the present invention.

[0033] Moreover, this summary of the invention includes features that may not be necessary features such that an embodiment of the present invention may also be a sub-combination of these described features.

BRIEF DESCRIPTION OF THE DRAWINGS

[0034] FIG. 1(a) is an image obtained as a result of the application of an averaging filter to a human facial image.

[0035] FIG. 1(b) is an image obtained as a result of the application of an averaging filter to another human facial image.

[0036] FIG. 1(c) is an image of a human face at p^((5,0)) obtained in a preferred embodiment in the base technology.

[0037] FIG. 1(d) is another image of a human face at p^((5,0)) obtained in a preferred embodiment in the base technology.

[0038] FIG. 1(e) is an image of a human face at p^((5,1)) obtained in a preferred embodiment in the base technology.

[0039] FIG. 1(f) is another image of a human face at p^((5,1)) obtained in a preferred embodiment in the base technology.

[0040] FIG. 1(g) is an image of a human face at p^((5,2)) obtained in a preferred embodiment in the base technology.

[0041] FIG. 1(h) is another image of a human face at p^((5,2)) obtained in a preferred embodiment in the base technology.

[0042] FIG. 1(i) is an image of a human face at p^((5,3)) obtained in a preferred embodiment in the base technology.

[0043] FIG. 1(j) is another image of a human face at p^((5,3)) obtained in a preferred embodiment in the base technology.

[0044] FIG. 2(R) shows an original quadrilateral.

[0045] FIG. 2(A) shows an inherited quadrilateral.

[0046] FIG. 2(B) shows an inherited quadrilateral.

[0047] FIG. 2(C) shows an inherited quadrilateral.

[0048] FIG. 2(D) shows an inherited quadrilateral.

[0049] FIG. 2(E) shows an inherited quadrilateral.

[0050] FIG. 3 is a diagram showing the relationship between a source image and a destination image and that between the m-th level and the (m−1)th level, using a quadrilateral.

[0051] FIG. 4 shows the relationship between a parameter η (represented by the x-axis) and energy C_(f) (represented by the y-axis).

[0052] FIG. 5(a) is a diagram illustrating determination of whether or not the mapping for a certain point satisfies the bijectivity condition through the outer product computation.

[0053] FIG. 5(b) is a diagram illustrating determination of whether or not the mapping for a certain point satisfies the bijectivity condition through the outer product computation.

[0054] FIG. 6 is a flowchart of the entire procedure of a preferred embodiment in the base technology.

[0055] FIG. 7 is a flowchart showing the details of the process at S1 in FIG. 6.

[0056] FIG. 8 is a flowchart showing the details of the process at S10 in FIG. 7.

[0057] FIG. 9 is a diagram showing correspondence between partial images of the m-th and (m−1)th levels of resolution.

[0058] FIG. 10 is a diagram showing source images generated in the embodiment in the base technology.

[0059] FIG. 11 is a flowchart of a preparation procedure for S2 in FIG. 6.

[0060] FIG. 12 is a flowchart showing the details of the process at S2 in FIG. 6.

[0061] FIG. 13 is a diagram showing the way a submapping is determined at the 0-th level.

[0062] FIG. 14 is a diagram showing the way a submapping is determined at the first level.

[0063] FIG. 15 is a flowchart showing the details of the process at S21 in FIG. 6.

[0064] FIG. 16 is a graph showing the behavior of energy C_(f)^((m,s)) corresponding to f^((m,s)) (λ=iΔλ) which has been obtained for a certain f^((m,s)) while changing λ.

[0065] FIG. 17 is a diagram showing the behavior of energy C_(f)^((n)) corresponding to f^((n)) (η=iΔη) (i=0,1, . . . ) which has been obtained while changing η.

[0066] FIG. 18 shows how pixels correspond between a first image and a second image.

[0067] FIG. 19 shows a correspondence relation between a source polygon taken on the first image and a destination polygon taken on the second image.

[0068] FIG. 20 shows a procedure by which to obtain the points in the destination polygon corresponding to the points in the source polygon.

[0069] FIG. 21 is a flowchart of a procedure for generating and imprinting a corresponding point file according to an embodiment of the present invention.

[0070] FIG. 22 is a flowchart showing a procedure for generating an intermediate image by extracting the corresponding point file according to an embodiment of the present invention.

[0071] FIG. 23 shows a functional block structure of an image processing apparatus according to an embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

[0072] The invention will now be described based on the preferred embodiments, which are not intended to limit the scope of the present invention, but exemplify the invention. Not all of the features and the combinations thereof described in the embodiments are necessarily essential to the invention.

[0073] First, the multiresolutional critical point filter technology and the image matching processing using the technology, both of which will be utilized in the preferred embodiments, will be described in detail as “Base Technology”. Namely, the following sections [1] and [2] (below) belong to the base technology, where section [1] describes elemental techniques and section [2] describes a processing procedure. These techniques are patented under Japanese Patent No. 2927350 and owned by the same assignee of the present invention. However, it is to be noted that the image matching techniques provided in the present embodiments are not limited to the same levels. In particular, in FIGS. 18 to 25, image data coding and decoding techniques, utilizing, in part, the base technology, will be described in more detail.

[0074] Base Technology

[0075] [1] Detailed Description of Elemental Techniques

[0076] [1.1] Introduction

[0077] Using a set of new multiresolutional filters called critical point filters, image matching is accurately computed. There is no need for any prior knowledge concerning the content of the images or objects in question. The matching of the images is computed at each resolution while proceeding through the resolution hierarchy. The resolution hierarchy proceeds from a coarse level to a fine level. Parameters necessary for the computation are set completely automatically by dynamical computation analogous to human visual systems. Thus, there is no need to manually specify the correspondence of points between the images.

[0078] The base technology can be applied to, for instance, completely automated morphing, object recognition, stereo photogrammetry, volume rendering, and smooth generation of motion images from a small number of frames. When applied to morphing, given images can be automatically transformed. When applied to volume rendering, intermediate images between cross sections can be accurately reconstructed, even when a distance between cross sections is rather large and the cross sections vary widely in shape.

[0079] [1.2] The Hierarchy of the Critical Point Filters

[0080] The multiresolutional filters according to the base technology preserve the intensity and location of each critical point included in the images while reducing the resolution. Initially, let the width of an image to be examined be N and the height of the image be M. For simplicity, assume that N=M=2^(n) where n is a positive integer. An interval [0, N]⊂R is denoted by I. A pixel of the image at position (i, j) is denoted by p^((i,j)) where i,j∈I.

[0081] Here, a multiresolutional hierarchy is introduced. Hierarchized image groups are produced by a multiresolutional filter. The multiresolutional filter carries out a two dimensional search on an original image and detects critical points therefrom. The multiresolutional filter then extracts the critical points from the original image to construct another image having a lower resolution. Here, the size of each of the respective images of the m-th level is denoted as 2^(m)×2^(m) (0≦m≦n). A critical point filter constructs the following four new hierarchical images recursively, in the direction descending from n.

$$\begin{aligned}
p_{(i,j)}^{(m,0)} &= \min\left(\min\left(p_{(2i,2j)}^{(m+1,0)},\, p_{(2i,2j+1)}^{(m+1,0)}\right),\, \min\left(p_{(2i+1,2j)}^{(m+1,0)},\, p_{(2i+1,2j+1)}^{(m+1,0)}\right)\right) \\
p_{(i,j)}^{(m,1)} &= \max\left(\min\left(p_{(2i,2j)}^{(m+1,1)},\, p_{(2i,2j+1)}^{(m+1,1)}\right),\, \min\left(p_{(2i+1,2j)}^{(m+1,1)},\, p_{(2i+1,2j+1)}^{(m+1,1)}\right)\right) \\
p_{(i,j)}^{(m,2)} &= \min\left(\max\left(p_{(2i,2j)}^{(m+1,2)},\, p_{(2i,2j+1)}^{(m+1,2)}\right),\, \max\left(p_{(2i+1,2j)}^{(m+1,2)},\, p_{(2i+1,2j+1)}^{(m+1,2)}\right)\right) \\
p_{(i,j)}^{(m,3)} &= \max\left(\max\left(p_{(2i,2j)}^{(m+1,3)},\, p_{(2i,2j+1)}^{(m+1,3)}\right),\, \max\left(p_{(2i+1,2j)}^{(m+1,3)},\, p_{(2i+1,2j+1)}^{(m+1,3)}\right)\right)
\end{aligned} \tag{1}$$

[0082] where we let

$$p_{(i,j)}^{(n,0)} = p_{(i,j)}^{(n,1)} = p_{(i,j)}^{(n,2)} = p_{(i,j)}^{(n,3)} = p_{(i,j)} \tag{2}$$

[0083] The above four images are referred to as subimages hereinafter. When min_(x≦t≦x+1) and max_(x≦t≦x+1) are abbreviated to α and β, respectively, the subimages can be expressed as follows:

$$\begin{aligned}
P^{(m,0)} &= \alpha(x)\alpha(y)\,p^{(m+1,0)} \\
P^{(m,1)} &= \alpha(x)\beta(y)\,p^{(m+1,1)} \\
P^{(m,2)} &= \beta(x)\alpha(y)\,p^{(m+1,2)} \\
P^{(m,3)} &= \beta(x)\beta(y)\,p^{(m+1,3)}
\end{aligned}$$

[0084] Namely, they can be considered analogous to the tensor products of α and β. The subimages correspond to the respective critical points. As is apparent from the above equations, the critical point filter detects a critical point of the original image for every block consisting of 2×2 pixels. In this detection, a point having a maximum pixel value and a point having a minimum pixel value are searched with respect to two directions, namely, vertical and horizontal directions, in each block. Although pixel intensity is used as a pixel value in this base technology, various other values relating to the image may be used. A pixel having the maximum pixel values for the two directions, one having minimum pixel values for the two directions, and one having a minimum pixel value for one direction and a maximum pixel value for the other direction are detected as a local maximum point, a local minimum point, and a saddle point, respectively.

[0085] By using the critical point filter, an image (1 pixel here) of a critical point detected inside each of the respective blocks serves to represent its block image (4 pixels here) in the next lower resolution level. Thus, the resolution of the image is reduced. From a singularity theoretical point of view, α(x)α(y) preserves the local minimum point (minima point), β(x)β(y) preserves the local maximum point (maxima point), α(x)β(y) and β(x)α(y) preserve the saddle points.
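The recursion of equations (1) and (2) may be sketched as follows for illustration. The sketch assumes a square image stored as a list of rows whose side length is a power of two, with the first list index playing the role of i and the second that of j; the names reduce_level, FILTERS and build_hierarchy are hypothetical.

```python
# (outer, inner) operator pairs for the four subimages s = 0..3 of equation (1).
FILTERS = {0: (min, min), 1: (max, min), 2: (min, max), 3: (max, max)}


def reduce_level(img, outer, inner):
    """Build the next lower resolution level from `img`.

    For each 2x2 block, `inner` combines the entries (2i, 2j) and (2i, 2j+1),
    and likewise (2i+1, 2j) and (2i+1, 2j+1); `outer` combines the two results.
    """
    size = len(img) // 2
    return [
        [
            outer(
                inner(img[2 * i][2 * j], img[2 * i][2 * j + 1]),
                inner(img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]),
            )
            for j in range(size)
        ]
        for i in range(size)
    ]


def build_hierarchy(image):
    """Return levels[m][s] for m = n down to 0 and s = 0..3."""
    side = len(image)
    if side == 0 or side & (side - 1) or any(len(row) != side for row in image):
        raise ValueError("image must be square with a power-of-two side")
    n = side.bit_length() - 1
    # Equation (2): all four subimages coincide with the image at level n.
    levels = {n: {s: image for s in FILTERS}}
    for m in range(n - 1, -1, -1):
        levels[m] = {
            s: reduce_level(levels[m + 1][s], *FILTERS[s]) for s in FILTERS
        }
    return levels
```

For a 4×4 image holding the values 1 to 16 in row order, level 1 of subimage 0 is [[1, 3], [9, 11]] and level 1 of subimage 3 is [[6, 8], [14, 16]], that is, the minimum and the maximum of each 2×2 block.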

[0086] At the beginning, a critical point filtering process is applied separately to a source image and a destination image which are to be matching-computed. Thus, a series of image groups, namely, source hierarchical images and destination hierarchical images are generated. Four source hierarchical images and four destination hierarchical images are generated corresponding to the types of the critical points.

[0087] Thereafter, the source hierarchical images and the destination hierarchical images are matched in a series of resolution levels. First, the minima points are matched using p^((m,0)). Next, the first saddle points are matched using p^((m,1)) based on the previous matching result for the minima points. The second saddle points are matched using p^((m,2)). Finally, the maxima points are matched using p^((m,3)).

[0088] FIGS. 1(c) and 1(d) show the subimages p^((5,0)) of the images in FIGS. 1(a) and 1(b), respectively. Similarly, FIGS. 1(e) and 1(f) show the subimages p^((5,1)), FIGS. 1(g) and 1(h) show the subimages p^((5,2)), and FIGS. 1(i) and 1(j) show the subimages p^((5,3)). Characteristic parts in the images can be easily matched using subimages. The eyes can be matched by p^((5,0)) since the eyes are the minima points of pixel intensity in a face. The mouths can be matched by p^((5,1)) since the mouths have low intensity in the horizontal direction. Vertical lines on both sides of the necks become clear by p^((5,2)). The ears and bright parts of the cheeks become clear by p^((5,3)) since these are the maxima points of pixel intensity.

[0089] As described above, the characteristics of an image can be extracted by the critical point filter. Thus, by comparing, for example, the characteristics of an image shot by a camera with the characteristics of several objects recorded in advance, an object shot by the camera can be identified.

[0090] [1.3] Computation of Mapping Between Images

[0091] Now, for matching images, a pixel of the source image at the location (i,j) is denoted by p_((i,j))^((n)) and that of the destination image at (k,l) is denoted by q_((k,l))^((n)) where i, j, k, l∈I. The energy of the mapping between the images (described later in more detail) is then defined. This energy is determined by the difference in the intensity of the pixel of the source image and its corresponding pixel of the destination image and the smoothness of the mapping. First, the mapping f^((m,0)):p^((m,0))→q^((m,0)) between p^((m,0)) and q^((m,0)) with the minimum energy is computed. Based on f^((m,0)), the mapping f^((m,1)) between p^((m,1)) and q^((m,1)) with the minimum energy is computed. This process continues until f^((m,3)) between p^((m,3)) and q^((m,3)) is computed. Each f^((m,i)) (i=0,1,2, . . . ) is referred to as a submapping. The order of i will be rearranged as shown in the following equation (3) in computing f^((m,i)) for reasons to be described later.

$$f^{(m,i)} : p^{(m,\sigma(i))} \rightarrow q^{(m,\sigma(i))} \tag{3}$$

[0092] where σ(i)∈{0,1,2,3}.

[0093] [1.3.1] Bijectivity

[0094] When the matching between a source image and a destination image is expressed by means of a mapping, that mapping shall satisfy the Bijectivity Conditions (BC) between the two images (note that a one-to-one surjective mapping is called a bijection). This is because the respective images should be connected satisfying both surjection and injection, and there is no conceptual supremacy existing between these images. It is to be noted that the mappings to be constructed here are the digital version of the bijection. In the base technology, a pixel is specified by a co-ordinate point.

[0095] The mapping of the source subimage (a subimage of a source image) to the destination subimage (a subimage of a destination image) is represented by f^((m,s)): I/2^(n−m)×I/2^(n−m)→I/2^(n−m)×I/2^(n−m) (s=0,1, . . . ) where f^((m,s))(i,j)=(k,l) means that p_((i,j))^((m,s)) of the source image is mapped to q_((k,l))^((m,s)) of the destination image. For simplicity, when f(i,j)=(k,l) holds, a pixel q_((k,l)) is denoted by q_(f(i,j)).

[0096] When the data sets are discrete as image pixels (grid points) treated in the base technology, the definition of bijectivity is important. Here, the bijection will be defined in the following manner, where i, j, k and l are all integers. First, a square region R defined on the source image plane is considered: $\begin{matrix}{p_{({i,j})}^{({m,s})}p_{({{i + 1},j})}^{({m,s})}p_{({{i + 1},{j + 1}})}^{({m,s})}p_{({i,{j + 1}})}^{({m,s})}} & (4)\end{matrix}$

[0097] where i=0, . . . , 2^(m)−1, and j=0, . . . , 2^(m)−1. The edges of R are directed as follows: $\begin{matrix}{\overset{\rightarrow}{p_{({i,j})}^{({m,s})}p_{({{i + 1},j})}^{({m,s})}},\overset{\rightarrow}{p_{({{i + 1},j})}^{({m,s})}p_{({{i + 1},{j + 1}})}^{({m,s})}},{\overset{\rightarrow}{p_{({{i + 1},{j + 1}})}^{({m,s})}p_{({i,{j + 1}})}^{({m,s})}}\quad {and}\quad \overset{\rightarrow}{p_{({i,{j + 1}})}^{({m,s})}p_{({i,j})}^{({m,s})}}}} & (5)\end{matrix}$

[0098] This square region R will be mapped by f to a quadrilateral on the destination image plane: $\begin{matrix}{q_{f{({i,j})}}^{({m,s})}q_{f{({{i + 1},j})}}^{({m,s})}q_{f{({{i + 1},{j + 1}})}}^{({m,s})}q_{f{({i,{j + 1}})}}^{({m,s})}} & (6)\end{matrix}$

[0099] This mapping f^((m,s))(R), that is, $\begin{matrix}{f^{(m,s)}(R) = f^{(m,s)}\left( p_{(i,j)}^{(m,s)}p_{(i+1,j)}^{(m,s)}p_{(i+1,j+1)}^{(m,s)}p_{(i,j+1)}^{(m,s)} \right) = q_{f(i,j)}^{(m,s)}q_{f(i+1,j)}^{(m,s)}q_{f(i+1,j+1)}^{(m,s)}q_{f(i,j+1)}^{(m,s)}}\end{matrix}$

[0100] should satisfy the following bijectivity conditions (referred to as BC hereinafter):

[0101] 1. The edges of the quadrilateral f^((m,s))(R) should not intersect one another.

[0102] 2. The orientation of the edges of f^((m,s))(R) should be the same as that of R (clockwise in the case shown in FIG. 2, described below).

[0103] 3. As a relaxed condition, a retraction mapping is allowed.

[0104] Without a certain type of relaxed condition such as condition 3 above, there would be no mappings which completely satisfy the BC other than a trivial identity mapping. Here, the length of a single edge of f^((m,s))(R) may be zero. Namely, f^((m,s))(R) may be a triangle. However, f^((m,s))(R) is not allowed to be a point or a line segment having area zero. Specifically speaking, if FIG. 2R is the original quadrilateral, FIGS. 2A and 2D satisfy the BC while FIGS. 2B, 2C and 2E do not satisfy the BC.
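As a non-authoritative illustration of conditions 1 to 3, the following Python sketch tests a mapped quadrilateral. The helper names (`orient`, `crosses`, `signed_area2`, `satisfies_bc`) are hypothetical and do not appear in the base technology. The sketch assumes that comparing the signs of the signed areas is an adequate reading of condition 2, and it detects only proper crossings of opposite edges.

```python
def orient(a, b, c):
    """Twice the signed area of triangle a, b, c."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def crosses(a, b, c, d):
    """True if segment a-b properly crosses segment c-d."""
    return (orient(a, b, c) * orient(a, b, d) < 0
            and orient(c, d, a) * orient(c, d, b) < 0)


def signed_area2(quad):
    """Twice the signed area of a quadrilateral (shoelace formula)."""
    return sum(quad[n][0] * quad[(n + 1) % 4][1]
               - quad[(n + 1) % 4][0] * quad[n][1] for n in range(4))


def satisfies_bc(src_quad, dst_quad):
    """Assumed check: no crossing edges, same orientation, non-zero area."""
    q0, q1, q2, q3 = dst_quad
    if crosses(q0, q1, q2, q3) or crosses(q1, q2, q3, q0):
        return False  # condition 1 violated
    # Same sign means same orientation (condition 2); a zero product
    # means a point or a line segment, which condition 3 does not allow.
    return signed_area2(src_quad) * signed_area2(dst_quad) > 0


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
triangle = [(0, 0), (1, 0), (1, 0), (0, 1)]   # one edge of length zero
bowtie = [(0, 0), (1, 1), (1, 0), (0, 1)]     # opposite edges cross
print(satisfies_bc(square, triangle), satisfies_bc(square, bowtie))
```

The triangle is accepted because one collapsed edge is permitted, whereas the self-intersecting quadrilateral is rejected.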

[0105] In actual implementation, the following condition may be further imposed to easily guarantee that the mapping is surjective. Namely, each pixel on the boundary of the source image is mapped to the pixel that occupies the same location at the destination image. In other words, f(i,j)=(i,j) (on the four lines of i=0, i=2^(m)−1, j=0, j=2^(m)−1). This condition will be hereinafter referred to as an additional condition.

[0106] [1.3.2] Energy of Mapping

[0107] [1.3.2.1] Cost Related to the Pixel Intensity

[0108] The energy of the mapping f is defined. An objective here is to search for a mapping whose energy becomes minimum. The energy is determined mainly by the difference in the intensity between the pixel of the source image and its corresponding pixel of the destination image. Namely, the energy C_((i,j)) ^((m,s)) of the mapping f^((m,s)) at (i,j) is determined by the following equation (7). $\begin{matrix}{C_{(i,j)}^{(m,s)} = \left| {V\left( p_{(i,j)}^{(m,s)} \right)} - {V\left( q_{f(i,j)}^{(m,s)} \right)} \right|^{2}} & (7)\end{matrix}$

[0109] where V(p_((i,j)) ^((m,s))) and V(q_(f(i,j)) ^((m,s))) are the intensity values of the pixels p_((i,j)) ^((m,s)) and q_(f(i,j)) ^((m,s)), respectively. The total energy C_(f) ^((m,s)) of f is a matching evaluation equation, and can be defined as the sum of C_((i,j)) ^((m,s)) as shown in the following equation (8). $\begin{matrix}{C_{f}^{({m,s})} = {\sum\limits_{i = 0}^{i = {2^{m} - 1}}\quad {\sum\limits_{j = 0}^{j = {2^{m} - 1}}\quad C_{({i,j})}^{({m,s})}}}} & (8)\end{matrix}$
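Equations (7) and (8) can be sketched as follows. This is a minimal illustration, assuming that the images are square lists of intensities indexed as `[i][j]` and that the mapping is held in a dictionary; the name `intensity_cost` and this data layout are assumptions made for the example only.

```python
def intensity_cost(src, dst, f):
    """Sum of squared intensity differences, equations (7) and (8).

    src, dst: square 2-D lists of intensities indexed as [i][j].
    f: dict mapping each source pixel (i, j) to a destination pixel (k, l).
    """
    size = len(src)
    total = 0
    for i in range(size):
        for j in range(size):
            k, l = f[(i, j)]
            diff = src[i][j] - dst[k][l]
            total += diff * diff  # C_(i,j) of equation (7)
    return total  # C_f of equation (8)


src = [[10, 20], [30, 40]]
dst = [[10, 25], [30, 40]]
identity = {(i, j): (i, j) for i in range(2) for j in range(2)}
print(intensity_cost(src, dst, identity))  # only pixel (0, 1) differs, by 5
```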

[0110] [1.3.2.2] Cost Related to the Locations of the Pixel for Smooth Mapping

[0111] In order to obtain smooth mappings, another energy D_(f) for the mapping is introduced. The energy D_(f) is determined by the locations of p_((i,j)) ^((m,s)) and q_(f(i,j)) ^((m,s)) (i=0,1, . . . , 2^(m)−1, j=0,1, . . . , 2^(m)−1), regardless of the intensity of the pixels. The energy D_((i,j)) ^((m,s)) of the mapping f^((m,s)) at a point (i,j) is determined by the following equation (9). $\begin{matrix}{D_{({i,j})}^{({m,s})} = {{\eta \quad E_{0{({i,j})}}^{({m,s})}} + E_{1{({i,j})}}^{({m,s})}}} & (9)\end{matrix}$

[0112] where the coefficient parameter η is a real number equal to or greater than 0, and $\begin{matrix}{E_{0{(i,j)}}^{(m,s)} = \left\| {(i,j)} - {f^{(m,s)}(i,j)} \right\|^{2}} & (10) \\{E_{1{(i,j)}}^{(m,s)} = {\sum\limits_{i^{\prime} = {i - 1}}^{i}\quad {\sum\limits_{j^{\prime} = {j - 1}}^{j}\quad {\left\| {\left( {f^{(m,s)}(i,j)} - (i,j) \right)} - {\left( {f^{(m,s)}(i^{\prime},j^{\prime})} - (i^{\prime},j^{\prime}) \right)} \right\|^{2}/4}}}} & (11)\end{matrix}$

[0113] where

∥(x,y)∥=√(x²+y²)  (12)

[0114] i′ and j′ are integers and f(i′,j′) is defined to be zero for i′<0 and j′<0. E₀ is determined by the distance between (i,j) and f(i,j). E₀ prevents a pixel from being mapped to a pixel too far away from it. However, as explained below, E₀ can be replaced by another energy function. E₁ ensures the smoothness of the mapping. E₁ represents a distance between the displacement of p(i,j) and the displacement of its neighboring points. Based on the above consideration, another evaluation equation for evaluating the matching, or the energy D_(f), is determined by the following equation: $\begin{matrix}{D_{f}^{({m,s})} = {\sum\limits_{i = 0}^{i = {2^{m} - 1}}\quad {\sum\limits_{j = 0}^{j = {2^{m} - 1}}\quad D_{({i,j})}^{({m,s})}}}} & (13)\end{matrix}$
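The smoothness energy of equations (9) to (13) may be sketched as below. The function name `smoothness_energy` and the dictionary representation of f are hypothetical. As a simplifying assumption that departs from the boundary rule stated above, neighbours with a negative index are simply skipped rather than given a defined value.

```python
def smoothness_energy(f, size, eta):
    """D_f of equation (13) for a mapping f given as {(i, j): (k, l)}."""
    def disp(i, j):
        k, l = f[(i, j)]
        return (k - i, l - j)  # displacement f(i,j) - (i,j)

    total = 0.0
    for i in range(size):
        for j in range(size):
            di, dj = disp(i, j)
            e0 = di * di + dj * dj  # equation (10)
            e1 = 0.0
            for ii in (i - 1, i):
                for jj in (j - 1, j):
                    if ii < 0 or jj < 0:
                        continue  # assumption: skip neighbours off the grid
                    ni, nj = disp(ii, jj)
                    e1 += ((di - ni) ** 2 + (dj - nj) ** 2) / 4  # equation (11)
            total += eta * e0 + e1  # equation (9)
    return total


identity = {(i, j): (i, j) for i in range(2) for j in range(2)}
shifted = dict(identity)
shifted[(1, 1)] = (1, 0)  # move one pixel by one position
print(smoothness_energy(identity, 2, 1.0), smoothness_energy(shifted, 2, 1.0))
```

Under this assumption the identity mapping has zero energy, and displacing a single pixel raises both E₀ and E₁.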

[0115] [1.3.2.3] Total Energy of the Mapping

[0116] The total energy of the mapping, that is, a combined evaluation equation which relates to the combination of a plurality of evaluations, is defined as λC_(f) ^((m,s))+D_(f) ^((m,s)), where λ≧0 is a real number. The goal is to detect a state in which the combined evaluation equation has an extreme value, namely, to find a mapping which gives the minimum energy expressed by the following: $\begin{matrix}{\min\limits_{f}\left\{ {{\lambda \quad C_{f}^{({m,s})}} + D_{f}^{({m,s})}} \right\}} & (14)\end{matrix}$

[0117] Care must be exercised in that the mapping becomes an identity mapping if λ=0 and η=0 (i.e., f^((m,s))(i,j)=(i,j) for all i=0,1, . . . , 2^(m)−1 and j=0,1, . . . , 2^(m)−1). As will be described later, the mapping can be gradually modified or transformed from an identity mapping since the case of λ=0 and η=0 is evaluated at the outset in the base technology. If the combined evaluation equation were instead defined as C_(f) ^((m,s))+λD_(f) ^((m,s)), with the position of λ changed, the equation with λ=0 and η=0 would be C_(f) ^((m,s)) only. As a result, pixels would be randomly matched to each other only because their pixel intensities are close, thus making the mapping totally meaningless. Transforming the mapping based on such a meaningless mapping makes no sense. Thus, the coefficient parameter is so determined that the identity mapping is initially selected for the evaluation as the best mapping.

[0118] Similar to this base technology, differences in the pixel intensity and smoothness are considered in a technique called “optical flow” that is known in the art. However, the optical flow technique cannot be used for image transformation since it takes into account only the local movement of an object. In contrast, global correspondence can also be detected by utilizing the critical point filter according to the base technology.

[0119] [1.3.3] Determining the Mapping with Multiresolution

[0120] A mapping f_(min) which gives the minimum energy and satisfies the BC is searched by using the multiresolution hierarchy. The mapping between the source subimage and the destination subimage at each level of the resolution is computed. Starting from the top of the resolution hierarchy (i.e., the coarsest level), the mapping is determined at each resolution level, and where possible, mappings at other levels are considered. The number of candidate mappings at each level is restricted by using the mappings at an upper (i.e., coarser) level of the hierarchy. More specifically speaking, in the course of determining a mapping at a certain level, the mapping obtained at the level coarser by one is imposed as a sort of constraint condition.

[0121] We thus define a parent and child relationship between resolution levels. When the following equation (15) holds, $\begin{matrix}{{\left( {i^{\prime},j^{\prime}} \right) = \left( {\left\lfloor \frac{i}{2} \right\rfloor,\left\lfloor \frac{j}{2} \right\rfloor} \right)},} & (15)\end{matrix}$

[0122] where ⌊x⌋ denotes the largest integer not exceeding x, p_((i′,j′)) ^((m−1,s)) and q_((i′,j′)) ^((m−1,s))

[0123] are respectively called the parents of p_((i,j)) ^((m,s)) and q_((i,j)) ^((m,s)).

[0124] Conversely, p_((i,j)) ^((m,s)) and q_((i,j)) ^((m,s))

[0125] are the child of p_((i′,j′)) ^((m−1,s)) and the child of q_((i′,j′)) ^((m−1,s)), respectively. A function parent(i,j) is defined by the following equation (16): $\begin{matrix}{{{parent}\left( {i,j} \right)} = \left( {\left\lfloor \frac{i}{2} \right\rfloor,\left\lfloor \frac{j}{2} \right\rfloor} \right)} & (16)\end{matrix}$

[0126] Now, a mapping between p_((i,j)) ^((m,s)) and q_((k,l)) ^((m,s)) is determined by computing the energy and finding the minimum thereof. The value of f^((m,s))(i,j)=(k,l) is determined as follows using f^((m−1,s)) (m=1,2, . . . , n). First of all, a condition is imposed that q_((k,l)) ^((m,s)) should lie inside a quadrilateral defined by the following definitions (17) and (18). Then, the applicable mappings are narrowed down by selecting ones that are thought to be reasonable or natural among them satisfying the BC. $\begin{matrix}{q_{g^{({m,s})}{({{i - 1},{j - 1}})}}^{({m,s})}q_{g^{({m,s})}{({{i - 1},{j + 1}})}}^{({m,s})}q_{g^{({m,s})}{({{i + 1},{j + 1}})}}^{({m,s})}q_{g^{({m,s})}{({{i + 1},{j - 1}})}}^{({m,s})}} & (17)\end{matrix}$

[0127] where $\begin{matrix}{{g^{({m,s})}\left( {i,j} \right)} = {{f^{({{m - 1},s})}\left( {{parent}\left( {i,j} \right)} \right)} + {f^{({{m - 1},s})}\left( {{{parent}\left( {i,j} \right)} + \left( {1,1} \right)} \right)}}} & (18)\end{matrix}$

[0128] The quadrilateral defined above is hereinafter referred to as the inherited quadrilateral of p_((i,j)) ^((m,s)). The pixel minimizing the energy is sought and obtained inside the inherited quadrilateral.
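The parent function of equation (16) and the inherited quadrilateral of definitions (17) and (18) may be illustrated by the following sketch. The dictionary `f_coarse`, standing for f^((m−1,s)), and the function names `g` and `inherited_quadrilateral` are assumptions for this example. Handling of pixels near the image boundary, where the parent or its diagonal neighbour falls outside the coarse grid, is deliberately left out.

```python
def parent(i, j):
    """Equation (16): floor division maps a pixel to its parent."""
    return (i // 2, j // 2)


def g(f_coarse, i, j):
    """Equation (18): f(parent(i,j)) + f(parent(i,j) + (1,1))."""
    pi, pj = parent(i, j)
    ax, ay = f_coarse[(pi, pj)]
    bx, by = f_coarse[(pi + 1, pj + 1)]
    return (ax + bx, ay + by)


def inherited_quadrilateral(f_coarse, i, j):
    """Corners listed in the order of definition (17)."""
    return [g(f_coarse, i - 1, j - 1), g(f_coarse, i - 1, j + 1),
            g(f_coarse, i + 1, j + 1), g(f_coarse, i + 1, j - 1)]


# Identity mapping on a 4x4 coarse level (interior pixels only are queried).
f_coarse = {(i, j): (i, j) for i in range(4) for j in range(4)}
print(inherited_quadrilateral(f_coarse, 3, 3))
```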

[0129] FIG. 3 illustrates the above-described procedures. The pixels A, B, C and D of the source image are mapped to A′, B′, C′ and D′ of the destination image, respectively, at the (m−1)th level in the hierarchy. The pixel p_((i,j)) ^((m,s)) should be mapped to the pixel q_(f^((m))(i,j)) ^((m,s)) which exists inside the inherited quadrilateral A′B′C′D′. Thereby, bridging from the mapping at the (m−1)th level to the mapping at the m-th level is achieved.

[0130] The energy E₀ defined above may now be replaced by the following equations (19) and (20): $\begin{matrix}{E_{0{(i,j)}} = \left\| {f^{(m,0)}(i,j)} - {g^{(m)}(i,j)} \right\|^{2}} & (19) \\{{E_{0{(i,j)}} = \left\| {f^{(m,s)}(i,j)} - {f^{(m,{s - 1})}(i,j)} \right\|^{2}},\quad \left( {1 \leq s} \right)} & (20)\end{matrix}$

[0131] for computing the submapping f^((m,0)) and the submapping f^((m,s)) at the m-th level, respectively.

[0132] In this manner, a mapping which maintains a low energy of all the submappings is obtained. Equation (20) associates the submappings corresponding to the different critical points with one another within the same level, so that the subimages can have high similarity. Equation (19) represents the distance between f^((m,s))(i,j) and the location where (i,j) should be mapped when regarded as a part of a pixel at the (m−1)th level.

[0133] When there is no pixel satisfying the BC inside the inherited quadrilateral A′B′C′D′, the following steps are taken. First, pixels whose distance from the boundary of A′B′C′D′ is L (at first, L=1) are examined. If a pixel whose energy is the minimum among them satisfies the BC, then this pixel will be selected as a value of f^((m,s))(i,j). L is increased until such a pixel is found or L reaches its upper bound L_(max) ^((m)). L_(max) ^((m)) is fixed for each level m. If no pixel is found at all, the third condition of the BC is ignored temporarily and such mappings that caused the area of the transformed quadrilateral to become zero (a point or a line) will be permitted so as to determine f^((m,s))(i,j). If such a pixel is still not found, then the first and the second conditions of the BC will be removed.

[0134] Multiresolution approximation is essential to determining the global correspondence of the images while preventing the mapping from being affected by small details of the images. Without the multiresolution approximation, it is impossible to detect a correspondence between pixels whose distances are large. In the case where the multiresolution approximation is not available, the size of an image will generally be limited to a very small size, and only tiny changes in the images can be handled. Moreover, imposing smoothness on the mapping usually makes it difficult to find the correspondence of such pixels. That is because the energy of the mapping from one pixel to another pixel which is far therefrom is high. On the other hand, the multiresolution approximation enables finding the approximate correspondence of such pixels. This is because the distance between the pixels is small at the upper (coarser) level of the hierarchy of the resolution.

[0135] [1.4] Automatic Determination of the Optimal Parameter Values

[0136] One of the main deficiencies of the existing image matching techniques lies in the difficulty of parameter adjustment. In most cases, the parameter adjustment is performed manually and it is extremely difficult to select the optimal value. However, according to the base technology, the optimal parameter values can be obtained completely automatically.

[0137] The systems according to this base technology include two parameters, namely, λ and η, where λ and η represent the weight of the difference of the pixel intensity and the stiffness of the mapping, respectively. In order to automatically determine these parameters, they are initially set to 0. First, λ is gradually increased from λ=0 while η is fixed at 0. As λ becomes larger and the value of the combined evaluation equation (equation (14)) is minimized, the value of C_(f) ^((m,s)) for each submapping generally becomes smaller. This basically means that the two images are matched better. However, if λ exceeds the optimal value, the following phenomena occur:

[0138] 1. Pixels which should not be corresponded are erroneously corresponded only because their intensities are close.

[0139] 2. As a result, correspondence between images becomes inaccurate, and the mapping becomes invalid.

[0140] 3. As a result, D_(f) ^((m,s)) in equation (14) tends to increase abruptly.

[0141] 4. As a result, since the value of equation (14) tends to increase abruptly, f^((m,s)) changes in order to suppress the abrupt increase of D_(f) ^((m,s)). As a result, C_(f) ^((m,s)) increases.

[0142] Therefore, a threshold value at which C_(f) ^((m,s)) turns to an increase from a decrease is detected while a state in which equation (14) takes the minimum value with λ being increased is kept. Such λ is determined as the optimal value at η=0. Next, the behavior of C_(f) ^((m,s)) is examined while η is increased gradually, and η will be automatically determined by a method described later. λ will then again be determined corresponding to such an automatically determined η.
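The detection of the turning point of C_(f) ^((m,s)) may be sketched as follows. The sketch assumes that a list of C_(f) ^((m,s)) values, each obtained by minimizing equation (14) at the corresponding λ, is already available; the function name `optimal_lambda` and the sample numbers are purely illustrative and are not measured data.

```python
def optimal_lambda(lambdas, costs):
    """Return the last lambda reached before C_f turns from decrease to increase.

    lambdas: increasing sequence of lambda values that were tried.
    costs:   C_f obtained for each lambda (assumed precomputed elsewhere).
    """
    for idx in range(1, len(costs)):
        if costs[idx] > costs[idx - 1]:
            return lambdas[idx - 1]
    return lambdas[-1]  # no increase observed within the tested range


lambdas = [0.0, 0.02, 0.04, 0.06, 0.08]
costs = [900, 700, 560, 610, 800]  # made-up values for illustration
print(optimal_lambda(lambdas, costs))
```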

[0143] The above-described method resembles the focusing mechanism of human visual systems. In the human visual systems, the images of the respective right eye and left eye are matched while moving one eye. When the objects are clearly recognized, the moving eye is fixed.

[0144] [1.4.1] Dynamic Determination of λ

[0145] Initially, λ is increased from 0 at a certain interval, and a subimage is evaluated each time the value of λ changes. As shown in equation (14), the total energy is defined by λC_(f) ^((m,s))+D_(f) ^((m,s)). D_((i,j)) ^((m,s))

[0146] in equation (9) represents the smoothness and theoretically becomes minimum when the mapping is the identity mapping. E₀ and E₁ increase as the mapping is further distorted. Since E₁ is an integer, 1 is the smallest step of D_(f) ^((m,s)). Thus, it is impossible to change the mapping to reduce the total energy unless a changed amount (reduction amount) of the current λC_((i,j)) ^((m,s)) is equal to or greater than 1. Since D_(f) ^((m,s)) increases by more than 1 accompanied by the change of the mapping, the total energy is not reduced unless λC_((i,j)) ^((m,s)) is reduced by more than 1.

[0147] Under this condition, it is shown that C_((i,j)) ^((m,s)) decreases in normal cases as λ increases. The histogram of C_((i,j)) ^((m,s)) is denoted as h(l), where h(l) is the number of pixels whose energy C_((i,j)) ^((m,s)) is l². In order that λl²≧1, for example, the case of l²=1/λ is considered. When λ varies from λ₁ to λ₂, a number of pixels (denoted A) expressed by the following equation (21): $\begin{matrix}{A = {\sum\limits_{l = {\lceil\frac{1}{\lambda_{2}}\rceil}}^{\lfloor\frac{1}{\lambda_{1}}\rfloor}{h(l)}} \cong {\int_{l = \frac{1}{\lambda_{2}}}^{\frac{1}{\lambda_{1}}}{{h(l)}\,{dl}}} = {- {\int_{\lambda_{2}}^{\lambda_{1}}{{h(l)}\frac{1}{\lambda^{3/2}}\,{d\lambda}}}} = {\int_{\lambda_{1}}^{\lambda_{2}}{\frac{h(l)}{\lambda^{3/2}}\,{d\lambda}}}} & (21)\end{matrix}$

[0148] changes to a more stable state having the energy shown in equation (22): $\begin{matrix}{{C_{f}^{({m,s})} - l^{2}} = {C_{f}^{({m,s})} - {\frac{1}{\lambda}.}}} & (22)\end{matrix}$

[0149] Here, it is assumed that the energy of these pixels is approximated to be zero. This means that the value of C_(f) ^((m,s)) changes by: $\begin{matrix}{{\partial C_{f}^{({m,s})}} = {- \frac{A}{\lambda}}} & (23)\end{matrix}$

[0150] As a result, equation (24) holds. $\begin{matrix}{\frac{\partial C_{f}^{({m,s})}}{\partial\lambda} = {- \frac{h(l)}{\lambda^{5/2}}}} & (24)\end{matrix}$

[0151] Since h(l)>0, C_(f) ^((m,s)) decreases in the normal case. However, when λ exceeds the optimal value, the above phenomenon, that is, an increase in C_(f) ^((m,s)), occurs. The optimal value of λ is determined by detecting this phenomenon.

[0152] When $\begin{matrix}{{h(l)} = {{H\quad l^{k}} = \frac{H}{\lambda^{k/2}}}} & (25)\end{matrix}$

[0153] is assumed, where both H (H>0) and k are constants, the equation (26) holds: $\begin{matrix}{\frac{\partial C_{f}^{({m,s})}}{\partial\lambda} = {- \frac{H}{\lambda^{{5/2} + {k/2}}}}} & (26)\end{matrix}$

[0154] Then, if k≠−3, the following equation (27) holds: $\begin{matrix}{C_{f}^{({m,s})} = {C + \frac{H}{\left( {{3/2} + {k/2}} \right)\lambda^{{3/2} + {k/2}}}}} & (27)\end{matrix}$

[0155] The equation (27) is a general equation of C_(f) ^((m,s))

[0156] (where C is a constant).
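For reference, the step from equation (26) to equation (27) is a direct integration with respect to λ. The following worked derivation uses the shorthand a = 3/2 + k/2, which is introduced here only for readability; the logarithmic case is the standard result for the excluded value k = −3.

```latex
\begin{align*}
\frac{\partial C_f^{(m,s)}}{\partial\lambda}
  &= -\frac{H}{\lambda^{5/2+k/2}} = -H\,\lambda^{-(a+1)},
  \qquad a = \tfrac{3}{2}+\tfrac{k}{2}, \\
C_f^{(m,s)}
  &= C + \frac{H}{a\,\lambda^{a}}
  \qquad (a \neq 0,\ \text{i.e. } k \neq -3), \\
\text{check:}\quad
\frac{d}{d\lambda}\left[\frac{H}{a\,\lambda^{a}}\right]
  &= \frac{H}{a}\,(-a)\,\lambda^{-a-1} = -\frac{H}{\lambda^{a+1}}, \\
k = -3:\quad
\frac{\partial C_f^{(m,s)}}{\partial\lambda}
  &= -\frac{H}{\lambda}
  \;\Longrightarrow\;
  C_f^{(m,s)} = C - H\ln\lambda .
\end{align*}
```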

[0157] When detecting the optimal value of λ, the number of pixels violating the BC may be examined for safety. In the course of determining a mapping for each pixel, the probability of violating the BC is assumed as a value p₀ here. In this case, since $\begin{matrix}{\frac{\partial A}{\partial\lambda} = \frac{h(l)}{\lambda^{3/2}}} & (28)\end{matrix}$

[0158] holds, the number of pixels violating the BC increases at a rate of: $\begin{matrix}{B_{0} = \frac{{h(l)}p_{0}}{\lambda^{3/2}}} & (29)\end{matrix}$

[0159] Thus, $\begin{matrix}{\frac{B_{0}\lambda^{3/2}}{p_{0}{h(l)}} = 1} & (30)\end{matrix}$

[0160] is a constant. If it is assumed that h(l)=Hl^(k), the following equation (31), for example,

B ₀λ^(3/2+k/2) =p ₀ H  (31)

[0161] becomes a constant. However, when λ exceeds the optimal value, the above value of equation (31) increases abruptly. By detecting this phenomenon, i.e. whether or not the value of B₀λ^(3/2+k/2)/2^(m)

[0162] exceeds an abnormal value B_(0thres), the optimal value of λ can be determined. Similarly, whether or not the value of B₁λ^(3/2+k/2)/2^(m)

[0163] exceeds an abnormal value B_(1thres) can be used to check for an increasing rate B₁ of pixels violating the third condition of the BC. The reason why the factor 2^(m) is introduced here will be described at a later stage. This system is not sensitive to the two threshold values B_(0thres) and B_(1thres). The two threshold values B_(0thres) and B_(1thres) can be used to detect excessive distortion of the mapping which may not be detected through observation of the energy C_(f) ^((m,s)).

[0164] In the experimentation, when λ exceeded 0.1 the computation of f^((m,s)) was stopped and the computation of f^((m,s+1)) was started. That is because the computation of submappings is affected by a difference of only 3 out of 255 levels in pixel intensity when λ>0.1 and it is then difficult to obtain a correct result.

[0165] [1.4.2] Histogram h(l)

[0166] The examination of C_(f) ^((m,s)) does not depend on the histogram h(l); however, the examination of the BC and its third condition may be affected by h(l). When (λ, C_(f) ^((m,s))) is actually plotted, k is usually close to 1. In the experiment, k=1 is used, that is, B₀λ² and B₁λ² are examined. If the true value of k is less than 1, B₀λ² and B₁λ² are not constants and increase gradually by a factor of λ^((1−k)/2). If h(l) is a constant, the factor is, for example, λ^(1/2). However, such a difference can be absorbed by setting the threshold B_(0thres) appropriately.

[0167] Let us model the source image by a circular object, with its center at (x₀, y₀) and its radius r, given by: $\begin{matrix}{{p\left( {i,j} \right)} = \left\{ \begin{matrix}{\frac{255}{r}\,c\left( \sqrt{\left( {i - x_{0}} \right)^{2} + \left( {j - y_{0}} \right)^{2}} \right)} & \left( {\sqrt{\left( {i - x_{0}} \right)^{2} + \left( {j - y_{0}} \right)^{2}} \leq r} \right) \\0 & ({otherwise})\end{matrix} \right.} & (32)\end{matrix}$

[0168] and the destination image given by: $\begin{matrix}{{q\left( {i,j} \right)} = \left\{ \begin{matrix}{\frac{255}{r}\,c\left( \sqrt{\left( {i - x_{1}} \right)^{2} + \left( {j - y_{1}} \right)^{2}} \right)} & \left( {\sqrt{\left( {i - x_{1}} \right)^{2} + \left( {j - y_{1}} \right)^{2}} \leq r} \right) \\0 & ({otherwise})\end{matrix} \right.} & (33)\end{matrix}$

[0169] with its center at (x₁,y₁) and radius r. In the above, let c(x) have the form of c(x)=x^(k). When the centers (x₀,y₀) and (x₁,y₁) are sufficiently far from each other, the histogram h(l) is then in the form:

h(l)∝rl ^(k)(k≠0)  (34)

[0170] When k=1, the images represent objects with clear boundaries embedded in the background. These objects become darker toward their centers and brighter toward their boundaries. When k=−1, the images represent objects with vague boundaries. These objects are brightest at their centers, and become darker toward their boundaries. Without much loss of generality, it suffices to state that objects in images are generally between these two types of objects. Thus, choosing k such that −1≦k≦1 can cover most cases and the equation (27) is generally a decreasing function for this range.

[0171] As can be observed from the above equation (34), attention must be directed to the fact that r is influenced by the resolution of the image, that is, r is proportional to 2^(m). This is the reason for the factor 2^(m) being introduced in the above section [1.4.1].

[0172] [1.4.3] Dynamic Determination of η

[0173] The parameter η can also be automatically determined in a similar manner. Initially, η is set to zero, and the final mapping f^((n)) and the energy C_(f) ^((n)) at the finest resolution are computed. Then, after η is increased by a certain value Δη, the final mapping f^((n)) and the energy C_(f) ^((n)) at the finest resolution are again computed. This process is repeated until the optimal value of η is obtained. η represents the stiffness of the mapping because it is a weight of the following equation (35): $\begin{matrix}{E_{0{(i,j)}}^{(m,s)} = \left\| {f^{(m,s)}(i,j)} - {f^{(m,{s - 1})}(i,j)} \right\|^{2}} & (35)\end{matrix}$

[0174] If η is zero, D_(f) ^((n)) is determined irrespective of the previous submapping, and the present submapping may be elastically deformed and become too distorted. On the other hand, if η is a very large value, D_(f) ^((n)) is almost completely determined by the immediately previous submapping. The submappings are then very stiff, and the pixels are mapped to almost the same locations. The resulting mapping is therefore the identity mapping. When the value of η increases from 0, C_(f) ^((n)) gradually decreases as will be described later. However, when the value of η exceeds the optimal value, the energy starts increasing as shown in FIG. 4. In FIG. 4, the x-axis represents η, and the y-axis represents C_(f).

[0175] The optimum value of η which minimizes C_(f) ^((n)) can be obtained in this manner. However, since more elements affect this computation than in the case of λ, C_(f) ^((n)) changes while slightly fluctuating. This difference arises because a submapping is re-computed once in the case of λ whenever an input changes slightly, whereas all the submappings must be re-computed in the case of η. Thus, whether the obtained value of C_(f) ^((n)) is the minimum or not cannot be determined as easily. When candidates for the minimum value are found, the true minimum needs to be searched by setting up further finer intervals.
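One possible way to organize the search with "further finer intervals" is a coarse sweep followed by a refinement around the best candidate, as sketched below. The callable `cost_at`, which stands for the computation of C_(f) ^((n)) for a given η, as well as the parameters `eta_max`, `coarse_step` and `refine`, are hypothetical; the quadratic used in the example is a stand-in and not a result from the base technology.

```python
def search_eta(cost_at, eta_max, coarse_step, refine=10):
    """Coarse sweep over eta, then a finer sweep around the best candidate."""
    steps = int(round(eta_max / coarse_step))
    coarse = [n * coarse_step for n in range(steps + 1)]
    best = min(coarse, key=cost_at)           # candidate for the minimum
    lo = max(0.0, best - coarse_step)
    hi = min(eta_max, best + coarse_step)
    fine = [lo + n * (hi - lo) / refine for n in range(refine + 1)]
    return min(fine, key=cost_at)             # refined estimate


def toy_cost(eta):
    """Stand-in for C_f at the finest level; minimum near eta = 0.33."""
    return (eta - 0.33) ** 2 + 1.0


print(search_eta(toy_cost, eta_max=1.0, coarse_step=0.25))
```

Because each evaluation of `cost_at` requires all submappings to be re-computed, the number of evaluations would in practice be kept small.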

[0176] [1.5] Supersampling

[0177] When deciding the correspondence between the pixels, the range of f^((m,s)) can be expanded to R×R (R being the set of real numbers) in order to increase the degree of freedom. In this case, the intensity of the pixels of the destination image is interpolated, to provide f^((m,s)) having an intensity at non-integer points: $\begin{matrix}{V\left( q_{f^{({m,s})}{({i,j})}}^{({m,s})} \right)} & (36)\end{matrix}$

[0178] That is, supersampling is performed. In an example implementation, f^((m,s)) may take integer and half integer values, and $\begin{matrix}{V\left( q_{{(i,j)} + {(0.5,0.5)}}^{(m,s)} \right)} & (37)\end{matrix}$

[0179] is given by $\begin{matrix}{\left( {{V\left( q_{({i,j})}^{({m,s})} \right)} + {V\left( q_{{({i,j})} + {({1,1})}}^{({m,s})} \right)}} \right)/2} & (38)\end{matrix}$

[0180] [1.6] Normalization of the Pixel Intensity of Each Image

[0181] When the source and destination images contain quite different objects, the raw pixel intensity may not be used to compute the mapping because a large difference in the pixel intensity causes excessively large energy C_(f) ^((m,s)), making it difficult to obtain an accurate evaluation.

[0182] For example, a matching between a human face and a cat's face is computed as shown in FIGS. 20(a) and 20(b). The cat's face is covered with hair and is a mixture of very bright pixels and very dark pixels. In this case, in order to compute the submappings of the two faces, subimages are normalized. That is, the darkest pixel intensity is set to 0 while the brightest pixel intensity is set to 255, and other pixel intensity values are obtained using linear interpolation.
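The normalization described above amounts to a linear rescaling of intensities, which may be sketched as follows. The function name `normalize`, the rounding to integers and the treatment of a completely flat subimage are assumptions of this example.

```python
def normalize(img):
    """Linearly rescale intensities so the darkest is 0 and the brightest 255."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        # Assumption: a flat subimage carries no contrast, so map it to 0.
        return [[0 for _ in row] for row in img]
    return [[round((v - lo) * 255 / (hi - lo)) for v in row] for row in img]


print(normalize([[50, 90], [150, 50]]))
```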

[0183] [1.7] Implementation

[0184] In an example implementation, a heuristic method is utilized wherein the computation proceeds linearly as the source image is scanned. First, the value of f^((m,s)) is determined at the top leftmost pixel (i,j)=(0,0). The value of each f^((m,s))(i,j) is then determined while i is increased by one at each step. When i reaches the width of the image, j is increased by one and i is reset to zero. Thereafter, f^((m,s))(i,j) is determined while scanning the source image. Once pixel correspondence is determined for all the points, it means that a single mapping f^((m,s)) is determined.

[0185] When a corresponding point q_(f(i,j)) is determined for p_((i,j)), a corresponding point q_(f(i,j+1)) of p_((i,j+1)) is determined next. The position of q_(f(i,j+1)) is constrained by the position of q_(f(i,j)) since the position of q_(f(i,j+1)) satisfies the BC. Thus, in this system, a point whose corresponding point is determined earlier is given higher priority. If the situation continues in which (0,0) is always given the highest priority, the final mapping might be unnecessarily biased. In order to avoid this bias, f^((m,s)) is determined in the following manner in the base technology.

[0186] First, when (s mod 4) is 0, f^((m,s)) is determined starting from (0,0) while gradually increasing both i and j. When (s mod 4) is 1, f^((m,s)) is determined starting from the top rightmost location while decreasing i and increasing j. When (s mod 4) is 2, f^((m,s)) is determined starting from the bottom rightmost location while decreasing both i and j. When (s mod 4) is 3, f^((m,s)) is determined starting from the bottom leftmost location while increasing i and decreasing j. Since a concept such as the submapping, that is, a parameter s, does not exist in the finest n-th level, f^((m,s)) is computed continuously in two directions on the assumption that s=0 and s=2.
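The four scanning directions can be summarized by the sketch below. The function name `scan_order` is hypothetical, and the sketch assumes that i varies fastest within each row for every value of s, which the text states explicitly only for the scan starting at (0,0).

```python
def scan_order(width, height, s):
    """Pixel visiting order (i, j) for submapping index s."""
    mode = s % 4
    # i increases for modes 0 and 3, decreases for modes 1 and 2.
    cols = range(width) if mode in (0, 3) else range(width - 1, -1, -1)
    # j increases for modes 0 and 1, decreases for modes 2 and 3.
    rows = range(height) if mode in (0, 1) else range(height - 1, -1, -1)
    return [(i, j) for j in rows for i in cols]


for s in range(4):
    print(s, scan_order(2, 2, s))
```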

[0187] In this implementation, the values of f^((m,s))(i,j) (m=0, . . . , n) that satisfy the BC are chosen as much as possible from the candidates (k,l) by imposing a penalty on the candidates violating the BC. The energy D_((k,l)) of a candidate that violates the third condition of the BC is multiplied by φ and that of a candidate that violates the first or second condition of the BC is multiplied by ψ. In this implementation, φ=2 and ψ=100000 are used.

[0188] In order to check the above-mentioned BC, the following test may be performed as the procedure when determining (k,l)=f^((m,s))(i,j). Namely, for each grid point (k,l) in the inherited quadrilateral of f^((m,s))(i,j), whether or not the z-component of the outer product of

$\begin{matrix}{W = {\overset{\rightarrow}{A} \times \overset{\rightarrow}{B}}} & (39)\end{matrix}$

[0189] is equal to or greater than 0 is examined, where $\begin{matrix}{\overset{\rightarrow}{A} = \overset{\rightarrow}{q_{f^{({m,s})}{({i,{j - 1}})}}^{({m,s})}q_{f^{({m,s})}{({{i + 1},{j - 1}})}}^{({m,s})}}} & (40) \\{\overset{\rightarrow}{B} = \overset{\rightarrow}{q_{f^{({m,s})}{({i,{j - 1}})}}^{({m,s})}q_{({k,l})}^{({m,s})}}} & (41)\end{matrix}$

[0190] Here, the vectors are regarded as 3D vectors and the z-axis is defined in the orthogonal right-hand coordinate system. When the z-component of W is negative, the candidate is imposed with a penalty by multiplying D_((k,l)) ^((m,s))

[0191] by ψ so that it is not as likely to be selected.
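The test of equations (39) to (41) and the associated penalty may be sketched as follows. The function names and the argument layout are hypothetical: `origin` stands for q_(f(i,j−1)), `a_end` for q_(f(i+1,j−1)) and `candidate` for q_((k,l)). Only the penalty ψ for a negative z-component is modelled; the factor φ for the third condition is omitted.

```python
PSI = 100000  # penalty factor psi quoted in the text


def cross_z(origin, a_end, b_end):
    """z-component of A x B, with A = a_end - origin and B = b_end - origin."""
    ax, ay = a_end[0] - origin[0], a_end[1] - origin[1]
    bx, by = b_end[0] - origin[0], b_end[1] - origin[1]
    return ax * by - ay * bx


def penalized_energy(d, origin, a_end, candidate):
    """Multiply the energy d by PSI when the z-component of W is negative."""
    if cross_z(origin, a_end, candidate) < 0:
        return d * PSI
    return d


print(penalized_energy(2.0, (0, 0), (1, 0), (1, 1)))
print(penalized_energy(2.0, (0, 0), (1, 0), (1, -1)))
```

A multiplicative penalty keeps the candidate selectable when nothing better exists, which matches the stated aim of making it merely less likely to be chosen.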

[0192] FIGS. 5(a) and 5(b) illustrate the reason why this condition is inspected. FIG. 5(a) shows a candidate without a penalty and FIG. 5(b) shows one with a penalty. When determining the mapping f^((m,s))(i,j+1) for the adjacent pixel at (i,j+1), there is no pixel on the source image plane that satisfies the BC if the z-component of W is negative because then q_((k,l)) ^((m,s))

[0193] passes the boundary of the adjacent quadrilateral.

[0194] [1.7.1] The Order of Submappings

[0195] In this implementation, σ(0)=0, σ(1)=1, σ(2)=2, σ(3)=3, σ(4)=0 are used when the resolution level is even, while σ(0)=3, σ(1)=2, σ(2)=1, σ(3)=0, σ(4)=3 are used when the resolution level is odd. Thus, the submappings are shuffled to some extent. It is to be noted that the submappings are primarily of four types, and s may be any of 0 to 3. However, a processing with s=4 is used in this implementation for a reason to be described later.

[0196] [1.8] Interpolations

[0197] After the mapping between the source and destination images is determined, the intensity values of the corresponding pixels are interpolated. In the implementation, trilinear interpolation is used. Suppose that a square p_((i,j))p_((i+1,j))p_((i+1,j+1))p_((i,j+1)) on the source image plane is mapped to a quadrilateral q_(f(i,j))q_(f(i+1,j))q_(f(i+1,j+1))q_(f(i,j+1)) on the destination image plane. For simplicity, the distance between the image planes is assumed to be 1. The intermediate image pixels r(x,y,t) (0≦x≦N−1, 0≦y≦M−1) whose distance from the source image plane is t (0≦t≦1) are obtained as follows. First, the location of the pixel r(x,y,t), where x,y,t∈R, is determined by equation (42): $\begin{matrix}\begin{matrix}{\left( {x,y} \right) = \quad {{\left( {1 - {dx}} \right)\left( {1 - {dy}} \right)\left( {1 - t} \right)\left( {i,j} \right)} + {\left( {1 - {dx}} \right)\left( {1 - {dy}} \right){{tf}\left( {i,j} \right)}} +}} \\{\quad {{{{dx}\left( {1 - {dy}} \right)}\left( {1 - t} \right)\left( {{i + 1},j} \right)} + {{{dx}\left( {1 - {dy}} \right)}{{tf}\left( {{i + 1},j} \right)}} +}} \\{\quad {{\left( {1 - {dx}} \right){{dy}\left( {1 - t} \right)}\left( {i,{j + 1}} \right)} + {\left( {1 - {dx}} \right){{dytf}\left( {i,{j + 1}} \right)}} +}} \\{\quad {{{{dxdy}\left( {1 - t} \right)}\left( {{i + 1},{j + 1}} \right)} + {{dxdytf}\left( {{i + 1},{j + 1}} \right)}}}\end{matrix} & (42)\end{matrix}$

[0198] The value of the pixel intensity at r(x,y,t) is then determined by equation (43):

$\begin{aligned}V(r(x,y,t)) ={}& (1-dx)(1-dy)(1-t)\,V(p_{(i,j)}) + (1-dx)(1-dy)\,t\,V(q_{f(i,j)}) \\ &+ dx(1-dy)(1-t)\,V(p_{(i+1,j)}) + dx(1-dy)\,t\,V(q_{f(i+1,j)}) \\ &+ (1-dx)\,dy\,(1-t)\,V(p_{(i,j+1)}) + (1-dx)\,dy\,t\,V(q_{f(i,j+1)}) \\ &+ dx\,dy\,(1-t)\,V(p_{(i+1,j+1)}) + dx\,dy\,t\,V(q_{f(i+1,j+1)})\end{aligned} \qquad (43)$

[0199] where dx and dy are parameters varying from 0 to 1.
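Equations (42) and (43) share the same eight weights, which the following sketch makes explicit. It is an illustration under stated assumptions rather than the implementation of the base technology: the mapping `f` and the intensity lookups `v_src` and `v_dst` are passed in as callables, and all names are hypothetical.

```python
def interpolate_pixel(i, j, dx, dy, t, f, v_src, v_dst):
    """Evaluate equations (42) and (43) for one source square.

    f(i, j)      -> (k, l): destination position of source pixel (i, j)
    v_src(i, j)  -> intensity V(p_(i,j)) in the source image
    v_dst(k, l)  -> intensity V(q_(k,l)) in the destination image
    Returns ((x, y), intensity) of the intermediate pixel r(x, y, t).
    """
    # Bilinear weights of the four corners of the source square.
    corners = [
        ((i, j), (1 - dx) * (1 - dy)),
        ((i + 1, j), dx * (1 - dy)),
        ((i, j + 1), (1 - dx) * dy),
        ((i + 1, j + 1), dx * dy),
    ]
    x = y = value = 0.0
    for (ci, cj), w in corners:
        fk, fl = f(ci, cj)
        # Blend each corner with its image under f, weighted by (1 - t) and t.
        x += w * ((1 - t) * ci + t * fk)
        y += w * ((1 - t) * cj + t * fl)
        value += w * ((1 - t) * v_src(ci, cj) + t * v_dst(fk, fl))
    return (x, y), value
```

At t=0 the result reduces to ordinary bilinear interpolation within the source square, and at t=1 to bilinear interpolation within the destination quadrilateral.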

[0200] [1.9] Mapping to Which Constraints are Imposed

[0201] So far, the determination of a mapping in which no constraints are imposed has been described. However, if a correspondence between particular pixels of the source and destination images is provided in a predetermined manner, the mapping can be determined using such correspondence as a constraint.

[0202] The basic idea is that the source image is roughly deformed by an approximate mapping which maps the specified pixels of the source image to the specified pixels of the destination image, and thereafter a mapping f is accurately computed.

[0203] First, the specified pixels of the source image are mapped to the specified pixels of the destination image, then the approximate mapping that maps other pixels of the source image to appropriate locations is determined. In other words, the mapping is such that pixels in the vicinity of a specified pixel are mapped to locations near the position to which the specified one is mapped. Here, the approximate mapping at the m-th level in the resolution hierarchy is denoted by F^((m)).

[0204] The approximate mapping F is determined in the following manner. First, the mappings for several pixels are specified. When n_(s) pixels

$p(i_{0},j_{0}),\ p(i_{1},j_{1}),\ \ldots,\ p(i_{n_{s}-1},j_{n_{s}-1}) \qquad (44)$

[0205] of the source image are specified, the following values in equation (45) are determined.

$F^{(n)}(i_{0},j_{0}) = (k_{0},l_{0}),\ F^{(n)}(i_{1},j_{1}) = (k_{1},l_{1}),\ \ldots,\ F^{(n)}(i_{n_{s}-1},j_{n_{s}-1}) = (k_{n_{s}-1},l_{n_{s}-1}) \qquad (45)$

[0206] For the remaining pixels of the source image, the amount of displacement is the weighted average of the displacements of p(i_(h), j_(h)) (h=0, . . . , n_(s)−1). Namely, a pixel p_((i,j)) is mapped to the following pixel (expressed by equation (46)) of the destination image.

$F^{(m)}(i,j) = \dfrac{(i,j) + \sum\limits_{h=0}^{h=n_{s}-1} (k_{h}-i_{h},\ l_{h}-j_{h})\,\mathrm{weight}_{h}(i,j)}{2^{n-m}} \qquad (46)$

[0207] where

$\mathrm{weight}_{h}(i,j) = \dfrac{1/\left\| (i_{h}-i,\ j_{h}-j) \right\|^{2}}{\mathrm{total\_weight}(i,j)} \qquad (47)$

[0208] where

$\mathrm{total\_weight}(i,j) = \sum\limits_{h=0}^{h=n_{s}-1} 1/\left\| (i_{h}-i,\ j_{h}-j) \right\|^{2} \qquad (48)$
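Equations (46) to (48) amount to inverse-square-distance weighting of the specified displacements. The sketch below illustrates this reading; the function name and the argument layout are hypothetical. The weights are undefined at a specified pixel itself (division by zero), so the sketch assumes that such a pixel is mapped directly to its specified target scaled to level m, which is the limit of equation (46) as (i,j) approaches (i_(h), j_(h)).

```python
def approximate_mapping(i, j, constraints, n, m):
    """Sketch of F^(m)(i, j) from equations (46)-(48).

    constraints: list of ((i_h, j_h), (k_h, l_h)) pairs as in (44) and (45).
    n, m: finest level and current level of the resolution hierarchy.
    """
    scale = 2 ** (n - m)
    inverse_sq = []
    for (ih, jh), (kh, lh) in constraints:
        d2 = (ih - i) ** 2 + (jh - j) ** 2
        if d2 == 0:
            # Assumption: a specified pixel keeps its specified target.
            return (kh / scale, lh / scale)
        inverse_sq.append(1.0 / d2)
    total_weight = sum(inverse_sq)  # equation (48)
    shift_i = shift_j = 0.0
    for ((ih, jh), (kh, lh)), inv in zip(constraints, inverse_sq):
        weight = inv / total_weight  # equation (47)
        shift_i += (kh - ih) * weight
        shift_j += (lh - jh) * weight
    return ((i + shift_i) / scale, (j + shift_j) / scale)  # equation (46)
```

A pixel midway between two specified pixels thus receives the average of their two displacements, while a pixel close to one specified pixel follows that pixel almost exactly.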

[0209] Second, the energy D_((i,j)) ^((m,s)) of the candidate mapping f is changed so that a mapping f similar to F^((m)) has a lower energy. Precisely speaking, D_((i,j)) ^((m,s)) is expressed by equation (49):

$D_{(i,j)}^{(m,s)} = E_{0_{(i,j)}}^{(m,s)} + \eta\,E_{1_{(i,j)}}^{(m,s)} + \kappa\,E_{2_{(i,j)}}^{(m,s)} \qquad (49)$

[0210] where

$E_{2_{(i,j)}}^{(m,s)} = \begin{cases} 0, & \text{if } \left\| F^{(m)}(i,j) - f^{(m,s)}(i,j) \right\|^{2} \le \left\lfloor \dfrac{\rho^{2}}{2^{2(n-m)}} \right\rfloor \\ \left\| F^{(m)}(i,j) - f^{(m,s)}(i,j) \right\|^{2}, & \text{otherwise} \end{cases} \qquad (50)$

[0211] where κ, ρ≧0. Finally, the resulting mapping f is determined by the above-described automatic computing process.

[0212] Note that E₂ _((i,j)) ^((m,s)) becomes 0 if f^((m,s))(i,j) is sufficiently close to F^((m))(i,j), i.e., if the squared distance therebetween is equal to or less than

$\left\lfloor \dfrac{\rho^{2}}{2^{2(n-m)}} \right\rfloor. \qquad (51)$
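The dead zone of equation (50) and its use in equation (49) can be sketched as follows. This is an illustrative reading only; the function names are hypothetical, and the energies E₀ and E₁ are assumed to be computed elsewhere and passed in as numbers.

```python
import math


def e2_penalty(F, f, rho, n, m):
    """Equation (50): zero inside the tolerance radius, squared distance outside."""
    d2 = (F[0] - f[0]) ** 2 + (F[1] - f[1]) ** 2
    threshold = math.floor(rho ** 2 / 2 ** (2 * (n - m)))  # expression (51)
    return 0 if d2 <= threshold else d2


def constrained_energy(e0, e1, e2, eta, kappa):
    """Equation (49): D = E0 + eta * E1 + kappa * E2."""
    return e0 + eta * e1 + kappa * e2
```

With ρ=2 at the finest level (m=n), a candidate whose squared distance from F^((m))(i,j) is 2 incurs no penalty, whereas at the next coarser level the threshold shrinks to 1 and the same offset is penalized by its squared distance.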

[0213] This has been defined in this way because it is desirable to determine each value f^((m,s))(i,j) automatically to fit in an appropriate place in the destination image as long as each value f^((m,s))(i,j) is close to F^((m))(i,j). For this reason, there is no need to specify the precise correspondence in detail to have the source image automatically mapped so that the source image matches the destination image.

[0214] [2] Concrete Processing Procedure

[0215] The flow of a process utilizing the respective elemental techniques described in [1] will now be described.

[0216] FIG. 6 is a flowchart of the overall procedure of the base technology. Referring to FIG. 6, a source image and destination image are first processed using a multiresolutional critical point filter (S1). The source image and the destination image are then matched (S2). As will be understood, the matching (S2) is not required in every case, and other processing such as image recognition may be performed instead, based on the characteristics of the source image obtained at S1.

[0217] FIG. 7 is a flowchart showing details of the process S1 shown in FIG. 6. This process is performed on the assumption that a source image and a destination image are matched at S2. Thus, a source image is first hierarchized using a critical point filter (S10) so as to obtain a series of source hierarchical images. Then, a destination image is hierarchized in a similar manner (S11) so as to obtain a series of destination hierarchical images. The order of S10 and S11 in the flow is arbitrary, and the source hierarchical images and the destination hierarchical images can be generated in parallel. It may also be possible to process a number of source and destination images as required by subsequent processes.

[0218] FIG. 8 is a flowchart showing details of the process at S10 shown in FIG. 7. Suppose that the size of the original source image is 2^(n)×2^(n). Since source hierarchical images are sequentially generated from an image with a finer resolution to one with a coarser resolution, the parameter m which indicates the level of resolution to be processed is set to n (S100). Then, critical points are detected from the images p^((m,0)), p^((m,1)), p^((m,2)) and p^((m,3)) of the m-th level of resolution, using a critical point filter (S101), so that the images p^((m−1,0)), p^((m−1,1)), p^((m−1,2)) and p^((m−1,3)) of the (m−1)th level are generated (S102). Since m=n here, p^((m,0))=p^((m,1))=p^((m,2))=p^((m,3))=p^((n)) holds and four types of subimages are thus generated from a single source image.

[0219] FIG. 9 shows correspondence between partial images of the m-th and those of the (m−1)th levels of resolution. Referring to FIG. 9, the respective numeric values shown in the figure represent the intensity of the respective pixels. p^((m,s)) symbolizes any one of the four images p^((m,0)) through p^((m,3)), and when generating p^((m−1,0)), p^((m,0)) is the one used as p^((m,s)). For example, as for the block shown in FIG. 9, comprising four pixels with their pixel intensity values indicated inside, images p^((m−1,0)), p^((m−1,1)), p^((m−1,2)) and p^((m−1,3)) acquire “3”, “8”, “6” and “10”, respectively, according to the rules described in [1.2]. This block at the m-th level is replaced at the (m−1)th level by the respective single pixels thus acquired. Therefore, the size of the subimages at the (m−1)th level is 2^(m−1)×2^(m−1).

[0220] After m is decremented (S103 in FIG. 8), it is ensured that m is not negative (S104). Thereafter, the process returns to S101, so that subimages of the next level of resolution, i.e., a next coarser level, are generated. The above process is repeated until subimages at m=0 (0-th level) are generated to complete the process at S10. The size of the subimages at the 0-th level is 1×1.

[0221] FIG. 10 shows source hierarchical images generated at S10 in the case of n=3. The initial source image is the only image common to the four series that follow. The four types of subimages are generated independently, depending on the type of critical point. Note that the process in FIG. 8 is common to S11 shown in FIG. 7, and that destination hierarchical images are generated through a similar procedure. Then, the process at S1 in FIG. 6 is completed.
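The loop S100 to S104 can be sketched as below. The rules of [1.2] are not reproduced in this part of the description, so the sketch assumes the four min/max combinations over each 2×2 block indicated in the comments; which pixels of the block form a pair, and all function names, are likewise assumptions made for illustration.

```python
def coarsen(p0, p1, p2, p3):
    """One pass of S101-S102: build the four (m-1)-level subimages."""
    half = len(p0) // 2

    def reduce_blocks(p, inner, outer):
        # Apply `inner` to each pixel pair of a 2x2 block, then `outer` across the pairs.
        return [[outer(inner(p[2 * i][2 * j], p[2 * i][2 * j + 1]),
                       inner(p[2 * i + 1][2 * j], p[2 * i + 1][2 * j + 1]))
                 for j in range(half)] for i in range(half)]

    return (reduce_blocks(p0, min, min),   # assumed rule for s = 0 (minima)
            reduce_blocks(p1, min, max),   # assumed rule for s = 1 (saddle)
            reduce_blocks(p2, max, min),   # assumed rule for s = 2 (saddle)
            reduce_blocks(p3, max, max))   # assumed rule for s = 3 (maxima)


def build_hierarchy(image):
    """Return levels n, n-1, ..., 0; each level is a tuple of four subimages."""
    levels = [(image, image, image, image)]  # at m = n all four coincide
    while len(levels[-1][0]) > 1:            # stop once the 1x1 level is reached
        levels.append(coarsen(*levels[-1]))
    return levels
```

Under these assumed rules, a block whose pixel pairs are (3, 6) and (8, 10) yields 3, 8, 6 and 10 for the four subimages, which agrees with the values quoted for FIG. 9, and an 8×8 image (n=3) produces four levels as in FIG. 10.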

[0222] In this base technology, in order to proceed to S2 shown in FIG. 6, a matching evaluation is prepared. FIG. 11 shows the preparation procedure. Referring to FIG. 11, a plurality of evaluation equations are set (S30). The evaluation equations may include the energy C_(f) ^((m,s)) concerning a pixel value, introduced in [1.3.2.1], and the energy D_(f) ^((m,s)) concerning the smoothness of the mapping introduced in [1.3.2.2]. Next, by combining these evaluation equations, a combined evaluation equation is set (S31). Such a combined evaluation equation may be λC_((i,j)) ^((m,s))+D_(f) ^((m,s)). Using η introduced in [1.3.2.2], we have

$\sum\sum\left( \lambda\,C_{(i,j)}^{(m,s)} + \eta\,E_{0_{(i,j)}}^{(m,s)} + E_{1_{(i,j)}}^{(m,s)} \right) \qquad (52)$

[0223] In equation (52), the sum is taken over each i and j, where i and j run through 0, 1, . . . , 2^(m)−1. Now, the preparation for matching evaluation is completed.

[0224] FIG. 12 is a flowchart showing the details of the process of S2 shown in FIG. 6. As described in [1], the source hierarchical images and destination hierarchical images are matched between images having the same level of resolution. In order to detect global correspondence correctly, a matching is calculated in sequence from a coarse level to a fine level of resolution. Since the source and destination hierarchical images are generated using the critical point filter, the location and intensity of critical points are preserved clearly even at a coarse level. Thus, the result of the global matching is superior to that of conventional methods.

[0225] Referring to FIG. 12, a coefficient parameter η and a level parameter m are set to 0 (S20). Then, a matching is computed between the four subimages at the m-th level of the source hierarchical images and those of the destination hierarchical images at the m-th level, so that four types of submappings f^((m,s)) (s=0, 1, 2, 3) which satisfy the BC and minimize the energy are obtained (S21). The BC is checked by using the inherited quadrilateral described in [1.3.3]. In that case, the submappings at the m-th level are constrained by those at the (m−1)th level, as indicated by the equations (17) and (18). Thus, the matching computed at a coarser level of resolution is used in subsequent calculation of a matching. This is called a vertical reference between different levels. If m=0, there is no coarser level and this exceptional case will be described using FIG. 13.

[0226] A horizontal reference within the same level is also performed. As indicated by the equation (20) in [1.3.3], f^((m,3)), f^((m,2)) and f^((m,1)) are respectively determined so as to be analogous to f^((m,2)), f^((m,1)) and f^((m,0)). This is because a situation in which the submappings are totally different seems unnatural even though the type of critical points differs, so long as the critical points are originally included in the same source and destination images. As can be seen from the equation (20), the closer the submappings are to each other, the smaller the energy becomes, so that the matching is then considered more satisfactory.

[0227] As for f^((m,0)), which is to be initially determined, a level coarser by one may be referred to, since there is no other submapping at the same level to be referred to, as shown in the equation (19). In this base technology, however, a procedure is adopted such that after the submappings have been obtained up to f^((m,3)), f^((m,0)) is recalculated once utilizing the thus obtained submappings as a constraint. This procedure is equivalent to a process in which s=4 is substituted into the equation (20) and f^((m,4)) is set to f^((m,0)) anew. The above process is employed to avoid the tendency in which the degree of association between f^((m,0)) and f^((m,3)) becomes too low. This scheme actually produced a preferable result. In addition to this scheme, the submappings are shuffled in the experiment as described in [1.7.1], so as to closely maintain the degrees of association among submappings which are originally determined independently for each type of critical point. Furthermore, in order to prevent the tendency of being dependent on the starting point in the process, the location thereof is changed according to the value of s as described in [1.7].

[0228] FIG. 13 illustrates how the submapping is determined at the 0-th level. Since at the 0-th level each sub-image is constituted by a single pixel, the four submappings f^((0,s)) are automatically chosen as the identity mapping. FIG. 14 shows how the submappings are determined at the first level. At the first level, each of the sub-images is constituted of four pixels, which are indicated by solid lines. When a corresponding point (pixel) of the point (pixel) x in p^((1,s)) is searched within q^((1,s)), the following procedure is adopted:

[0229] 1. An upper left point a, an upper right point b, a lower left point c and a lower right point d with respect to the point x are obtained at the first level of resolution.

[0230] 2. Pixels to which the points a to d belong at a level coarser by one, i.e., the 0-th level, are searched. In FIG. 14, the points a to d belong to the pixels A to D, respectively. However, the pixels A to C are virtual pixels which do not exist in reality.

[0231] 3. The corresponding points A′ to D′ of the pixels A to D, which have already been defined at the 0-th level, are plotted in q^((1,s)). The pixels A′ to C′ are virtual pixels and are regarded as being located at the same positions as the pixels A to C.

[0232] 4. The corresponding point a′ to the point a in the pixel A is regarded as being located inside the pixel A′, and the point a′ is plotted. Then, it is assumed that the position occupied by the point a in the pixel A (in this case, positioned at the lower right) is the same as the position occupied by the point a′ in the pixel A′.

[0233] 5. The corresponding points b′ to d′ are plotted by using the same method as in step 4 above so as to produce an inherited quadrilateral defined by the points a′ to d′.

[0234] 6. The corresponding point x′ of the point x is searched such that the energy becomes minimum in the inherited quadrilateral. Candidate corresponding points x′ may be limited to the pixels, for instance, whose centers are included in the inherited quadrilateral. In the case shown in FIG. 14, the four pixels all become candidates.

[0235] The above describes a procedure for determining the corresponding point of a given point x. The same processing is performed on all other points so as to determine the submappings. As the inherited quadrilateral is expected to become deformed at the upper levels (higher than the second level), the pixels A′ to D′ will be positioned apart from one another as shown in FIG. 3.

[0236] Once the four submappings at the m-th level are determined in this manner, m is incremented (S22 in FIG. 12). Then, when it is confirmed that m does not exceed n (S23), the process returns to S21. Thereafter, every time the process returns to S21, submappings at a finer level of resolution are obtained until the process finally returns to S21, at which time the mapping f^((n)) at the n-th level is determined. This mapping is denoted as f^((n))(η=0) because it has been determined relative to η=0.

[0237] Next, to obtain the mapping with respect to other different η, η is shifted by Δη and m is reset to zero (S24). After confirming that the new η does not exceed a predetermined search-stop value η_(max) (S25), the process returns to S21 and the mapping f^((n))(η=Δη) relative to the new η is obtained. This process is repeated while obtaining f^((n))(η=iΔη) (i=0,1, . . . ) at S21. When η exceeds η_(max), the process proceeds to S26 and the optimal η=η_(opt) is determined using a method described later, so as to let f^((n))(η=η_(opt)) be the final mapping f^((n)).

[0238] FIG. 15 is a flowchart showing the details of the process of S21 shown in FIG. 12. According to this flowchart, the submappings at the m-th level are determined for a certain predetermined η. In this base technology, when determining the mappings, the optimal λ is defined independently for each submapping.

[0239] Referring to FIG. 15, s and λ are first reset to zero (S210). Then, the submapping f^((m,s)) that minimizes the energy with respect to the current λ (and, implicitly, η) is obtained (S211), and the thus obtained submapping is denoted as f^((m,s))(λ=0). In order to obtain the mapping with respect to other different λ, λ is shifted by Δλ. After confirming that the new λ does not exceed a predetermined search-stop value λ_(max) (S213), the process returns to S211 and the mapping f^((m,s))(λ=Δλ) relative to the new λ is obtained. This process is repeated while obtaining f^((m,s))(λ=iΔλ) (i=0,1, . . . ). When λ exceeds λ_(max), the process proceeds to S214 and the optimal λ=λ_(opt) is determined, so as to let f^((m,s))(λ=λ_(opt)) be the final mapping f^((m,s)) (S214).

[0240] Next, in order to obtain other submappings at the same level, λ is reset to zero and s is incremented (S215). After confirming that s does not exceed 4 (S216), the process returns to S211. When s=4, f^((m,0)) is renewed utilizing f^((m,3)) as described above and a submapping at that level is determined.

[0241] FIG. 16 shows the behavior of the energy C_(f) ^((m,s)) corresponding to f^((m,s))(λ=iΔλ) (i=0,1, . . . ) for a certain m and s while varying λ. As described in [1.4], as λ increases, C_(f) ^((m,s)) normally decreases but changes to increase after λ exceeds the optimal value. In this base technology, the λ at which C_(f) ^((m,s)) takes its minimum is defined as λ_(opt). As observed in FIG. 16, even if C_(f) ^((m,s)) begins to decrease again in the range λ>λ_(opt), the mapping will not be as good. For this reason, it suffices to pay attention to the first occurring minimum. In this base technology, λ_(opt) is independently determined for each submapping including f^((n)).

[0242] FIG. 17 shows the behavior of the energy C_(f) ^((n)) corresponding to f^((n))(η=iΔη) (i=0,1, . . . ) while varying η. Here too, C_(f) ^((n)) normally decreases as η increases, but C_(f) ^((n)) changes to increase after η exceeds the optimal value. Thus, the η at which C_(f) ^((n)) takes its minimum is defined as η_(opt). FIG. 17 can be considered as an enlarged graph around zero along the horizontal axis shown in FIG. 4. Once η_(opt) is determined, f^((n)) can be finally determined.
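The selection of λ_(opt) and η_(opt) follows the same pattern: the parameter is swept in fixed steps up to its search-stop value, and the first occurring minimum of the energy is taken. A minimal sketch of that pattern follows; `energy_at` is a hypothetical callable standing in for "compute the mapping for this parameter value and return C_(f)", and both function names are introduced here for illustration.

```python
def first_minimum_index(energies):
    """Index of the first occurring minimum: the last sample before the first rise."""
    for k in range(len(energies) - 1):
        if energies[k + 1] > energies[k]:
            return k
    return len(energies) - 1  # the energy never rose within the sweep


def sweep_parameter(energy_at, step, stop):
    """Evaluate the energy at 0, step, 2*step, ... (not exceeding stop)
    and return the parameter value at the first occurring minimum."""
    energies = []
    k = 0
    while k * step <= stop:
        energies.append(energy_at(k * step))
        k += 1
    return first_minimum_index(energies) * step
```

Note that a later, deeper dip is deliberately ignored: for the energies 5, 4, 3, 4, 2 the sketch selects the third sample, in line with the remark above that a renewed decrease beyond the optimum does not yield a better mapping.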

[0243] As described above, this base technology provides various merits. First, since there is no need to detect edges, problems in connection with the conventional techniques of the edge detection type are solved. Furthermore, prior knowledge about objects included in an image is not required, and thus automatic detection of corresponding points is achieved. Using the critical point filter, it is possible to preserve the intensity and locations of critical points even at a coarse level of resolution, which is extremely advantageous when applied to object recognition, characteristic extraction, and image matching. As a result, it is possible to construct an image processing system which significantly reduces manual labor.

[0244] Some further extensions to or modifications of the above-described base technology may be made as follows: (1) Parameters are automatically determined when the matching is computed between the source and destination hierarchical images in the base technology. This method can be applied not only to the calculation of the matching between the hierarchical images but also to computing the matching between two images in general.

[0245] For instance, an energy E₀ relative to a difference in the intensity of pixels and an energy E₁ relative to a positional displacement of pixels between two images may be used as evaluation equations, and a linear sum of these equations, i.e., E_(tot)=αE₀+E₁, may be used as a combined evaluation equation. While paying attention to the neighborhood of the extrema in this combined evaluation equation, α is automatically determined. Namely, mappings which minimize E_(tot) are obtained for various α's. Among such mappings, α at which E_(tot) takes the minimum value is defined as an optimal parameter. The mapping corresponding to this parameter is finally regarded as the optimal mapping between the two images.

[0246] Many other methods are available in the course of setting up evaluation equations. For instance, a term which becomes larger as the evaluation result becomes more favorable, such as 1/E₁ and 1/E₂, may be employed. A combined evaluation equation is not necessarily a linear sum, but an n-powered sum (n=2, ½, −1, −2, etc.), a polynomial or an arbitrary function may be employed when appropriate.

[0247] The system may employ a single parameter such as the above α, two parameters such as η and λ as in the base technology, or more than two parameters. When there are more than three parameters used, they may be determined while changing one at a time.

[0248] (2) In the base technology, a parameter is determined in a two-step process. That is, a mapping that minimizes the value of the combined evaluation equation is determined first, and the point at which C_(f) ^((m,s)) takes its minimum is then detected. However, instead of this two-step processing, a parameter may be effectively determined, as the case may be, in a manner such that the minimum value of a combined evaluation equation becomes minimum. In this case, αE₀+βE₁, for example, may be used as the combined evaluation equation, where α+β=1 may be imposed as a constraint so as to equally treat each evaluation equation. The automatic determination of a parameter is effective when determining the parameter such that the energy becomes minimum.

[0249] (3) In the base technology, four types of submappings related to four types of critical points are generated at each level of resolution. However, one, two, or three types among the four types may be selectively used. For instance, if there exists only one bright point in an image, generation of hierarchical images based solely on f^((m,3)) related to a maximum point can be effective to a certain degree. In this case, no other submapping is necessary at the same level, and thus the amount of computation relative to s is effectively reduced.

[0250] (4) In the base technology, as the level of resolution of an image advances by one through a critical point filter, the number of pixels becomes 1/4. However, it is possible to suppose that one block consists of 3×3 pixels and critical points are searched in this 3×3 block, in which case the number of pixels will be 1/9 as the level advances by one.

[0251] (5) In the base technology, if the source and the destination images are color images, they would generally first be converted to monochrome images, and the mappings then computed. The source color images may then be transformed by using the mappings thus obtained. However, as an alternate method, the submappings may be computed regarding each RGB component.

[0252] Preferred Embodiments Concerning Image Processing

[0253] Image processing techniques utilizing the above-described base technology will now be described. Generally speaking, these techniques involve imprinting a corresponding point file and a program used in decoding (hereinafter referred to as a “reproduction program”) into any key frame for later use in generating intermediate images or the like. Since the corresponding point file and reproduction program are “hidden” in the key frames, the key frames appear to be transmitted simply as discrete images when a decoding apparatus or the like does not know that the data is imprinted. For example, key frames may be compressed in an intraframe format according to the JPEG (Joint Photographic Experts Group) standard and sent to a general viewer which can decode JPEG. In this case, only the key frames can be reproduced since the general viewer cannot identify the corresponding point file or reproduction program. On the other hand, a decoding apparatus or viewer which can extract the imprinted corresponding point file and reproduction program, such as described below, can use the reproduction program to generate intermediate frames from the key frames and the corresponding point file and, thus, can reproduce not only the key frames but also the intermediate frames. It is possible, therefore, to provide backward compatibility with existing technologies and thus promote wider use and acceptance of this new technology.

[0254] In a particular example, a user receives an “electronic key”, which can be considered a “motion picture reproducing kit”, by paying a registration and content fee in order to extract the corresponding point file and the reproduction program. This key extracts the imprinted corresponding point file and reproduction program and executes the program.

[0255] Interestingly, because the reproduction program is transmitted every time the key frames are distributed, the reproduction program can be upgraded easily with each distribution.

[0256] It will be understood that the corresponding point file and the reproduction program must be relatively small in order to be imprinted into the key frames. The reproduction program performs processes as described in relation to FIG. 22 below, and it has been confirmed in an experiment that the program can be reduced to a size of at most 100 kilobytes. The corresponding point file, on the other hand, may be fairly large if the corresponding point file describes the detailed pixel-by-pixel correspondence of the base technology. Hereunder, therefore, an effective compression of the corresponding point file using a mesh is first described, following which an image processing apparatus will be described in relation to FIG. 23.

[0257] FIG. 18 shows a first image I1 and a second image I2, which serve as key frames, in which certain pixels p₁(x₁, y₁) and p₂(x₂, y₂) correspond therebetween. The correspondence of the pixels may be obtained using the base technology.

[0258] Referring to FIG. 19, a mesh is provided on the first image I1 and corresponding positions of lattice points are shown on the second image I2. In particular, a polygon R1 on the first image I1 is determined by four lattice points A, B, C and D. This polygon R1 is called a “source polygon”. As has been shown in FIG. 18, these lattice points A, B, C and D have respectively corresponding points A′, B′, C′ and D′ on the second image I2, and a polygon R2 formed by the corresponding points is called a “destination polygon.” In this embodiment, the source polygon is generally a rectangle, while the destination polygon is generally a quadrilateral. In any event, according to the present embodiment, the correspondence relation between the first and second images is not described pixel by pixel; instead, corresponding points are described only with respect to the lattice points of the source polygon. This description is then written in a corresponding point file. By directing attention to the lattice points only, the volume of the corresponding point file can be reduced significantly.
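The reduction in volume can be illustrated by sampling a dense correspondence only at the lattice points. The sketch below is illustrative only: the dense mapping is assumed to be available as a callable `dense_map(x, y)`, the mesh spacing is a free parameter, the last row and column of pixels are added as lattice points so that the mesh covers the whole image, and the dictionary layout is not the file format of the embodiment.

```python
def lattice_axis(length, spacing):
    """Lattice coordinates along one axis, always including the last pixel."""
    ticks = list(range(0, length, spacing))
    if ticks[-1] != length - 1:
        ticks.append(length - 1)
    return ticks


def lattice_correspondence(dense_map, width, height, spacing):
    """Keep the correspondence for lattice points only.

    dense_map(x, y) -> (x2, y2): position in the second image that
    corresponds to pixel (x, y) of the first image.
    """
    return {(x, y): dense_map(x, y)
            for y in lattice_axis(height, spacing)
            for x in lattice_axis(width, spacing)}
```

For a 256×256 image and a spacing of 16 pixels, this keeps 17×17=289 correspondences instead of 65,536.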

[0259] As described in the base technology, the corresponding point file is utilized for generating intermediate images between the first image I1 and the second image I2. In particular, intermediate images at arbitrary temporal or spatial positions can be generated by interpolating between the corresponding points. Thus, by using the first image I1, the second image I2 and the corresponding point file, it is possible to generate smooth motion pictures or morphing between the two images I1 and I2. Thus, a compression effect on motion pictures can be obtained by selecting appropriate key frames.

[0260] FIG. 20 shows an example method for computing a correspondence relation for points other than the lattice points, from the corresponding point file. Since the corresponding point file contains information on the lattice points only, data corresponding to interior points of each polygon need to be computed separately. FIG. 20 shows correspondence between a triangle ABC (which corresponds to a lower half of the source polygon R1 shown in FIG. 19) and a triangle A′B′C′ (which corresponds to a lower half of the destination polygon R2 shown in FIG. 19). Now, for an interior point Q of triangle ABC, the intersection point of the line segment AC and the line extended from B through Q to AC interior-divides the line segment AC in the ratio t:(1−t), and the point Q interior-divides the line segment connecting this AC interior-dividing point and point B in the ratio s:(1−s). Similarly, for the corresponding point Q′ in triangle A′B′C′, the intersection point of the line segment A′C′ and the line extended from B′ through Q′ to A′C′ interior-divides the line segment A′C′ in the ratio t:(1−t), and the point Q′ interior-divides the line segment connecting this A′C′ interior-dividing point and point B′ in the ratio s:(1−s). Namely, it is preferable that the source polygon is divided into triangles, and interior points of the destination polygon are determined by using interior division of the vectors concerning each triangle. When expressed in a vector skew field, this becomes

BQ=(1−s){(1−t)BA+tBC},

[0261] thus, we have

B′Q′=(1−s){(1−t)B′A′+tB′C′}

[0262] Similar processing is also performed between a triangle ACD which corresponds to an upper half of the source polygon R1 and a triangle A′C′D′ which likewise corresponds to an upper half of the destination polygon R2.
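The two vector relations above can be evaluated by solving for the coefficients of BA and BC and reusing them in the destination triangle. In the sketch below, u=(1−s)(1−t) and v=(1−s)t are solved directly by Cramer's rule instead of recovering s and t separately; this substitution and the function name are choices made for illustration only.

```python
def map_point_in_triangle(Q, A, B, C, A2, B2, C2):
    """Map interior point Q of triangle ABC to Q' in triangle A'B'C'.

    Solves BQ = u*BA + v*BC, where u = (1-s)(1-t) and v = (1-s)t,
    and returns Q' = B' + u*B'A' + v*B'C'.
    """
    bax, bay = A[0] - B[0], A[1] - B[1]
    bcx, bcy = C[0] - B[0], C[1] - B[1]
    bqx, bqy = Q[0] - B[0], Q[1] - B[1]
    det = bax * bcy - bay * bcx
    if det == 0:
        raise ValueError("degenerate source triangle")
    u = (bqx * bcy - bqy * bcx) / det  # coefficient of BA
    v = (bax * bqy - bay * bqx) / det  # coefficient of BC
    return (B2[0] + u * (A2[0] - B2[0]) + v * (C2[0] - B2[0]),
            B2[1] + u * (A2[1] - B2[1]) + v * (C2[1] - B2[1]))
```

When the destination triangle is the source triangle scaled by two, every interior point is mapped to twice its coordinates, which gives a quick sanity check of the relation.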

[0263] FIG. 21 shows a flowchart of the encoding procedure described above. Firstly, the matching results on the lattice points taken on the first image I1 are acquired (S10) as shown in FIG. 19. In the matching, it is preferable that the pixel-by-pixel matching according to the base technology is performed, so that a portion corresponding to the lattice points is extracted from those results. It is to be noted that the matching results on the lattice points may alternatively be specified based on other matching techniques, such as optical flow and block matching, instead of using the base technology.

[0264] Thereafter, a destination polygon is defined on the second image I2 (S12), as shown in the right side of FIG. 19. The above procedure completes the generation of the corresponding point file. The corresponding point file and the reproduction program are then imprinted into the first image I1. The imprinted or altered first image I1a and the second image I2 may then be output, transmitted, stored, or the like.

[0265] An experiment has indicated that high quality intermediate frames with, for example, a resolution of about 256×256 pixels can be acquired from a corresponding point file of some tens of kilobytes or less when the size of the corresponding point file is adjusted so as to be appropriate for imprinting in a key frame. The size of the data imprinted, therefore, will be only about 100 kilobytes when the corresponding point file is imprinted together with the reproduction program.

[0266] There are various known watermark techniques which can be utilized as a method for imprinting, such as a modulo masking or a density pattern method in which the information of pixel intensity is manipulated, or an ordered dither method in which threshold information is manipulated. It will be understood that any appropriate technique may be used for imprinting in this embodiment. It is known, for example, that using the density pattern method, text data of about 70 kilobytes can be incorporated into an image of 256×256 pixels×8 bits without spoiling the optical quality of the image. In addition or alternatively, the corresponding point file and the reproduction program can be imprinted without spoiling the optical quality of the images because they can be imprinted not only into the first image I1 but also into the second image I2 and any succeeding key frames, though this depends on the actual application of the technology.
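The imprinting and extraction round trip can be illustrated with a deliberately simple scheme. The sketch below uses plain least-significant-bit substitution, which is not one of the techniques named above (modulo masking, density pattern, ordered dither) and is swapped in only because it fits in a few lines; the function names and the flat `bytearray` pixel layout are assumptions.

```python
def imprint(pixels: bytearray, payload: bytes) -> bytearray:
    """Hide payload bits in the least significant bit of successive pixel bytes."""
    bits = [(byte >> k) & 1 for byte in payload for k in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload too large for this image")
    out = bytearray(pixels)  # leave the original key frame untouched
    for idx, bit in enumerate(bits):
        out[idx] = (out[idx] & 0xFE) | bit
    return out


def extract(pixels: bytearray, n_bytes: int) -> bytes:
    """Recover n_bytes of payload written by imprint()."""
    data = bytearray()
    for b in range(n_bytes):
        value = 0
        for k in range(8):
            value = (value << 1) | (pixels[8 * b + k] & 1)
        data.append(value)
    return bytes(data)
```

Each pixel value changes by at most one intensity level, which illustrates why a viewer unaware of the imprint still displays an essentially unchanged key frame. This particular scheme carries only one bit per pixel byte, so its capacity is much lower than the figure quoted above for the density pattern method.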

[0267] FIG. 22 shows a flowchart of a decoding procedure, which is generally performed at a decoding apparatus or the like at the location of a user to whom the motion picture is distributed. Namely, FIG. 22 shows a procedure for generating intermediate images (i.e., a motion picture) from an input picture stream comprising the first image I1, the second image I2 and so forth. As described above, an electronic key may be distributed to the user prior to this procedure (not shown in FIG. 22), so that the user is able to extract the corresponding point file and the reproduction program.

[0268] The first image I1 is first read in (S20), and the corresponding point file and the reproduction program are extracted, in this example by using the electronic key (S22), at the terminal of the user. Decoding is also performed if the key frame, corresponding point file, or reproduction program has been separately encoded. Methods for extracting imprinted data are known for each watermark technique, such as the modulo masking described above, and an appropriate method may be utilized in this embodiment.

[0269] Thereafter, a correspondence relation between points in source polygons and those in destination polygons is computed by a method such as that shown in FIG. 20 (S24). At this time, the correspondence relation for all pixels within each image can be acquired. As described in the base technology, the coordinates and colors of points corresponding to each other can be interior-divided in the ratio u:(1−u), so that an intermediate image at a position which interior-divides, with respect to time for example, the interval between the first image I1 and the second image I2 in the ratio u:(1−u) can be generated (S26).
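The interior division in the ratio u:(1−u) can be illustrated for a single pair of corresponding points as follows. This is a minimal sketch, assuming that coordinates are given as (x, y) tuples and colors as (r, g, b) tuples, and that u = 0 reproduces the first image I1 and u = 1 the second image I2; the function name `interpolate_point` is hypothetical.

```python
def interpolate_point(p1, p2, c1, c2, u):
    """Interior-divide the position and color of a corresponding point pair.

    p1, c1: position (x, y) and color (r, g, b) of the point on the first image.
    p2, c2: position and color of the corresponding point on the second image.
    u: division parameter in [0, 1]; the result divides the segment from the
       first point to the second point in the ratio u:(1 - u).
    """
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    position = tuple((1 - u) * a + u * b for a, b in zip(p1, p2))
    color = tuple((1 - u) * a + u * b for a, b in zip(c1, c2))
    return position, color


# A point that moves from (0, 0) to (10, 20) while its color changes.
pos, col = interpolate_point((0, 0), (10, 20), (0, 0, 0), (100, 200, 50), 0.25)
# pos == (2.5, 5.0), col == (25.0, 50.0, 12.5)
```

Applying the same computation to every corresponding pixel pair, and writing each interpolated color at its interpolated position, yields one intermediate image for the chosen value of u.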

[0270] FIG. 23 shows a structure of an image processing apparatus 10 which performs the above-described procedure. The apparatus 10 comprises: an image input unit 12 which acquires the first image I1 and the second image I2 from an external storage device, a photographing camera or the like; a matching processor 14 which performs a matching computation on these images using the base technology or other techniques; an imprinting unit 100 which imprints the corresponding point file F generated by the matching processor 14 and the reproduction program into the first image I1; an image data storing unit 16 which stores the first image as altered by the imprinting (herein referred to as an “altered first image I1a”), the second image I2 and other images; an extracting unit 102 which extracts the corresponding point file F and, by utilizing an electronic key which is separately distributed via a route not shown in FIG. 23, the reproduction program from the altered first image I1a; an intermediate image generator 18 which generates intermediate images between the first image I1 and the second image I2 from the first image I1 (acquired by removing the imprinted data from the altered first image I1a), the second image I2 and the corresponding point file F; and a display unit 20 which displays the first image I1, the second image I2 and the intermediate images as a series of images similar to a motion picture by adjusting the timing of display. In this apparatus, the reproduction program described above is implemented as the intermediate image generator 18 after being extracted by the extracting unit 102. The functions of the reproduction program may also comprise a part or the whole of the function of the display unit 20.

[0271] Additionally, a communication unit 22 may send out the altered first image I1a, the second image I2 and other images to a transmission infrastructure such as a network or the like according to a request from an external unit.

[0272] In FIG. 23, mesh information, that is, data which indicate the size of the mesh, the positions of the lattice points and so forth, is provided to the matching processor 14. This mesh information may be preset for various resolution levels, may be input by a user, or the like.

[0273] It will be understood that the apparatus 10 described above is a combination of structures for encoding and decoding. Put simply, the imprinting unit 100 and the units preceding it are the structures for encoding, and the extracting unit 102 and the units succeeding it are the structures for decoding. The image data storing unit 16 is common to both structures and may be provided in both apparatuses if encoding and decoding are performed by separate apparatuses.

[0274] With the above-described structure, the encoding process proceeds as follows. The first image I1 and the second image I2 are input to the image input unit 12 and are sent to the matching processor 14. The matching processor 14 performs a pixel-by-pixel matching computation between those images. The matching processor 14 then generates the corresponding point file F based on the mesh data, and the corresponding point file F thus generated is output to the imprinting unit 100. The first image I1 is also input to the imprinting unit 100. The imprinting unit 100 imprints the corresponding point file F and also the reproduction program (which is separately provided) into the first image I1 and outputs the altered first image I1a to the image data storing unit 16. The image data storing unit 16 also stores the second image I2 and succeeding images. Encoding is completed by the processing described above.
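Because the corresponding point file F and the reproduction program are imprinted together and later separated by the extracting unit 102, some framing of the combined data is needed. One simple possibility, shown here only as a sketch, is to prefix the two byte strings with their lengths. The layout (two 32-bit big-endian lengths followed by the data) and the names `pack_payload` and `unpack_payload` are assumptions made for illustration and are not specified in this description.

```python
import struct

HEADER = ">II"  # two unsigned 32-bit big-endian lengths (assumed layout)


def pack_payload(point_file, program):
    """Concatenate the corresponding point file and the reproduction program."""
    return struct.pack(HEADER, len(point_file), len(program)) + point_file + program


def unpack_payload(blob):
    """Split a blob produced by pack_payload() back into its two parts."""
    n_points, n_program = struct.unpack_from(HEADER, blob, 0)
    start = struct.calcsize(HEADER)
    if start + n_points + n_program > len(blob):
        raise ValueError("payload is truncated")
    point_file = blob[start:start + n_points]
    program = blob[start + n_points:start + n_points + n_program]
    return point_file, program
```

The packed bytes would be handed to whichever imprinting method is chosen, and the decoding side would call `unpack_payload` on the extracted bytes before passing the corresponding point file F to the intermediate image generator 18.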

[0275] A corresponding point file F which is generated between the second image I2 and a third image I3 may also be imprinted into the second image I2. Thus, the processing may also be recursive. Further, the reproduction program may be divided as necessary and imprinted into the second image I2 and the succeeding images when the size of the reproduction program is such that the quality of the images would be affected if the program were imprinted solely in the first image I1.
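Dividing the reproduction program among several key frames can be sketched as a simple split by per-frame capacity. The capacities are assumed to be known in advance (for example, from the imprinting method and the tolerable quality loss for each frame), and the function name `split_payload` is hypothetical.

```python
def split_payload(payload, capacities):
    """Split payload into consecutive chunks, one per key frame.

    capacities[k] is the assumed maximum number of bytes that can be
    imprinted into key frame k. Raises ValueError if the frames together
    cannot hold the whole payload. Later chunks may be empty.
    """
    chunks = []
    offset = 0
    for capacity in capacities:
        chunks.append(payload[offset:offset + capacity])
        offset += capacity
    if offset < len(payload):
        raise ValueError("key frames cannot hold the whole payload")
    return chunks


# Three key frames that can each carry 4 bytes.
parts = split_payload(b"abcdefghij", [4, 4, 4])
# parts == [b"abcd", b"efgh", b"ij"]; b"".join(parts) restores the payload
```

On the decoding side, the chunks extracted from successive key frames are concatenated in frame order to restore the program.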

[0276] After encoding and distribution or storage in the image data storing unit 16, decoding proceeds as follows. The extracting unit 102 reads out the altered first image I1a from the image data storing unit 16 and extracts the corresponding point file F and the reproduction program by utilizing the electronic key. The extracted corresponding point file F is transmitted to the intermediate image generator 18, and the reproduction program is loaded into a memory (not shown) in an executable format as the entire intermediate image generator 18 or a part thereof.

[0277] The intermediate image generator 18 generates the intermediate images between the first image I1 and the second image I2 by performing interpolation using the corresponding point file F, the first image I1 (which is acquired by removing the imprinted data from the altered first image I1a) and the second image I2. The intermediate images are transmitted to the display unit 20. The timing of outputting the images is adjusted in the display unit 20 such that motion pictures or morphing pictures are displayed. It is to be noted that the first image I1 which is acquired by removing the imprinted data from the altered first image I1a is not necessarily completely equal to the original first image I1 before imprinting and extracting. A complete correspondence between the original first image I1 and the decoded first image I1 will be realized only if the imprinting and extraction are lossless.

[0278] The communication unit 22 is provided in consideration of a situation in which the decoding is performed remotely. The communication unit 22 transmits a coded data stream which, in appearance, merely seems to be a series of image frames, such as the altered first image I1a and the second image I2. Upon receipt at a remote site, the coded data stream may be either stored or processed for display. A user who has only a viewer for JPEG, for example, and does not have the electronic key to access the reproduction program or corresponding point file can still reproduce the key frames frame by frame when the altered first image I1a and other images are described in a JPEG format. This structure encourages a user who wants to enjoy the complete content as motion pictures to acquire the electronic key, and a business model can be promoted in which the electronic keys are distributed upon payment of a fee.

[0279] It will be understood that there are many variations and alternate arrangements of the procedures and apparatus described above. Several variations are now described.

[0280] Although encoding and decoding of motion pictures are considered in the above-described embodiments of the present invention, it is not necessary that the interpolation be performed temporally. Spatial interpolation between multi-viewpoint images can also be performed and used in a similar way.

[0281] The first image I1 and other images may be compressed by arbitrary image compression methods, including JPEG as described above. In these cases, the compression may be performed separately from the encoding described above, that is, the incorporation of the corresponding point information into the images. With regard to decoding, it is sufficient if decompression and the described interpolation of the images are each performed.

[0282] Although the embodiments above involve images, the present invention can also be applied generally to other forms of digital content. It is sufficient if the digital content is acquired and a program for reproducing or decoding the content, that is, a reproduction program, is imprinted into the content. As particular examples, this content may have:

[0283] 1) particularity in relationship to the reproduction program such that the entire content can be reproduced by utilizing the program, though the content is stored in a generalized format in which it is possible to partly reproduce the content without the reproduction program; or

[0284] 2) particularity in relationship to the reproduction program such that the content can be reproduced with high quality by utilizing the reproduction program, though the content is stored in a generalized format in which it is possible to reproduce the content with low quality without the reproduction program.

[0285] These variations can be derived from the description above by considering that at least key frames can be reproduced without the specific reproduction program when reproducing motion pictures according to the preferred embodiment described above. Further, it is also possible to consider a situation in which the reproduction program is imprinted into motion picture data which normally can be reproduced for one minute, so that a longer motion picture, for example 10 minutes, might be reproduced with the reproduction program. Similarly, a reproduction program which can reproduce an entire music album may be imprinted into data which is stored in a manner such that only a single song can be reproduced in MP3 or another format.

[0286] Similar processing can be considered for both image quality and sound quality. Original data, for example, can normally be reproduced only in a thinned-out or lower quality manner, and a program for reproducing the expanded or entire data may be imprinted into the data.

[0287] According to the above-described embodiments, an electronic key is distributed to a user via a route or at a timing which is separate from that of the distribution of the image data. This key may, however, be distributed to the user by being imprinted into the images. Alternatively, the reproduction program itself may be distributed to the user in advance. For example, the reproduction program may be structured in such a manner that it can be downloaded free of charge. A method may comprise: acquiring images; and imprinting data utilized for image processing into the images, similar to the embodiments described above. In this case, the above-described “data utilized for image processing” corresponds to an electronic key.

[0288] Further variations, alterations or features are defined in the following references:

[0289] 7. An image processing method, comprising: acquiring a first image and a second image; computing a matching between the acquired first and second images; and imprinting information of corresponding points acquired as a result of the matching into at least one of the first and second images.

[0290] 8. An image processing method, comprising: acquiring a first image and a second image; computing a matching between the acquired first and second images; and imprinting information of corresponding points acquired as a result of the matching into an image which is comprised in a motion picture stream which comprises the first and second images.

[0291] 9. An image processing apparatus, comprising: an image input unit which acquires images; and an imprinting unit which imprints data utilized for processing the images into the images.

[0292] 10. An apparatus according to reference 9, wherein the imprinting unit imprints data regarding interpolation of the images.

[0293] 11. An apparatus according to reference 9, wherein the imprinting unit imprints information of corresponding points between at least selected images of the images and other images.

[0294] 12. An image processing apparatus, comprising: an image input unit which acquires images; and an imprinting unit which imprints data utilized for decoding the images thereinto.

[0295] 13. An apparatus according to reference 12, wherein the imprinting unit imprints data regarding interpolation of the images.

[0296] 14. An apparatus according to reference 12, wherein the imprinting unit imprints information of corresponding points between the images and other images.

[0297] 15. An image processing apparatus, comprising: an input unit which acquires a first image and a second image; a matching processor which computes a matching between the acquired first and second images; and an imprinting unit which imprints information of corresponding points acquired as a result of the matching into at least one of the first and second images.

[0298] 16. An image processing apparatus, comprising: an input unit which acquires a first image and a second image; a matching processor which computes a matching between the acquired first and second images; and an imprinting unit which imprints information of corresponding points acquired as a result of the matching into an image comprised in a motion picture stream which comprises the first and second images.

[0299] 17. An apparatus according to reference 15 or 16, wherein the matching processor performs a pixel-by-pixel matching computation based on correspondence between a critical point detected through a two-dimensional search on the first image and a critical point detected through a two-dimensional search on the second image.

[0300] 18. An apparatus according to reference 17, wherein the matching processor multiresolutionalizes the first image and the second image by respectively extracting the critical points, then performs the pixel-by-pixel matching computation between same multiresolution levels, and acquires a pixel-by-pixel correspondence relation at a finest level of resolution while inheriting a result of the pixel-by-pixel matching computation from a matching computation at a different multiresolution level.

[0301] 27. An image processing apparatus, comprising: an image input unit which acquires images; and an extracting unit which extracts data imprinted into the acquired images therefrom, which are utilized for performing processing thereon.

[0302] 28. An apparatus according to reference 27, wherein the extracting unit extracts data regarding interpolation of the images.

[0303] 29. An apparatus according to reference 27, wherein the extracting unit extracts information of corresponding points between the images and other images.

[0304] 30. An apparatus according to one of the references 27, 28, or 29, further comprising: an intermediate image generator which performs interpolation of the images based on the extracted data; and an output unit which outputs motion pictures acquired as a result of the interpolation.

[0305] 31. An image processing apparatus, comprising: an image input unit which acquires images; and an extracting unit which extracts data imprinted into the acquired images therefrom, which are utilized for decoding the images.

[0306] 32. An apparatus according to reference 31, wherein the extracting unit extracts data regarding interpolation of the images.

[0307] 33. An apparatus according to reference 31, wherein the extracting unit extracts information of corresponding points between the images and other images.

[0308] 34. An apparatus according to one of the references 31, 32, or 33, further comprising: an intermediate image generator which performs interpolation of the images based on the extracted data; and an output unit which outputs motion pictures acquired as a result of the interpolation.

[0309] 35. An image processing method, comprising: acquiring a first image and a second image as key frames, which are respectively a predetermined distance from each other; computing a matching between the acquired first and second images; compressing the first image and the second image in an intraframe format; imprinting information of corresponding points acquired as a result of the matching into at least one of the compressed first and second images; generating a coded motion picture stream which comprises at least the compressed first and second images as the key frames after imprinting; and outputting the generated coded motion picture stream.

[0310] 36. An image processing method, comprising: acquiring a first image and a second image as key frames, which are respectively a predetermined distance from each other; computing a matching between the acquired first and second images; compressing the first image and the second image in an intraframe format; imprinting information of corresponding points acquired as a result of the matching into a predetermined image in a coded motion picture stream which comprises the compressed first and second images; generating the coded motion picture stream which comprises at least the compressed first and second images and the predetermined image as the key frames after imprinting; and outputting the generated coded motion picture stream.

[0311] 37. A computer program executable by a computer, the program comprising the functions of: acquiring images; and imprinting, into the images, data utilized for processing which is performed thereon.

[0312] 38. A computer program executable by a computer, the program comprising the functions of: acquiring images; and imprinting data utilized for decoding the images thereinto.

[0313] 39. A computer program executable by a computer, the program comprising the functions of: acquiring images; and extracting data imprinted into the acquired images therefrom, which are utilized for performing processing thereon.

[0314] 40. A computer program executable by a computer, the program comprising the functions of: acquiring images; and extracting data imprinted into the images therefrom, which are utilized for decoding the images.

[0315] 41. A computer program executable by a computer according to reference 39 or 40, further comprising the function of acquiring an electronic key utilized for extracting the data.

[0316] 43. A method according to reference 7, further comprising distributing an electronic key for extracting the information of the corresponding points to a user.

[0317] 45. A method according to reference 35, further comprising distributing an electronic key for extracting the imprinted information of the corresponding points to a user.

[0318] 46. An image processing method, comprising: acquiring images; and imprinting a program for reproducing the images thereinto.

[0319] 47. An image processing method, comprising: acquiring images; and imprinting a program for decoding the images thereinto.

[0320] 48. A method according to one of the references 46 or 47, wherein the images comprise discrete image frames and the program converts the image frames into continuous motion pictures.

[0321] 49. A method according to one of the references 46 or 47, wherein the program performs interpolation processing on the images.

[0322] 50. A method according to reference 49, wherein the interpolation processing is processing which generates an intermediate frame between a plurality of key frames based on information of corresponding points between the key frames.

[0323] 51. A method according to reference 50, wherein the information of the corresponding points is also imprinted into the images in addition to the program.

[0324] 52. A method according to one of the references 46 or 47, further comprising distributing an electronic key for extracting the program to a user.

[0325] 53. An image processing method, comprising: acquiring a first image and a second image; computing a matching between the acquired first and second images; imprinting information of corresponding points acquired as a result of the matching into at least one of the first and second images; and imprinting a program for generating an intermediate image of the first image and the second image based on the imprinted information of the corresponding points into at least one of the first and second images.

[0326] 54. An image processing method, comprising: acquiring a first image and a second image; computing a matching between the acquired first and second images; imprinting information of corresponding points acquired as a result of the matching into an image comprised in a motion picture stream which comprises the first image and the second image; and imprinting a program for generating an intermediate image of the first image and the second image based on the imprinted information of the corresponding points into at least one of the first and second images.

[0327] 55. A method according to reference 53, further comprising distributing an electronic key for extracting the program to a user.

[0328] 56. An image processing apparatus, comprising: an image input unit which acquires images; and an imprinting unit which imprints a program for reproducing the images thereinto.

[0329] 57. An image processing apparatus, comprising: an image input unit which acquires images; and an imprinting unit which imprints a program for decoding the images thereinto.

[0330] 58. An image processing method, comprising: acquiring images; and extracting a program imprinted into the acquired images therefrom, which is utilized for reproducing the images.

[0331] 59. An image processing method, comprising: acquiring images; and extracting a program imprinted into the acquired images therefrom, which is utilized for decoding the images.

[0332] 60. A method according to one of the references 58 or 59, further comprising acquiring an electronic key for extracting the program from the images.

[0333] 61. A method according to one of the references 58 or 59, further comprising extracting information of corresponding points, which is imprinted into the images, in addition to the program.

[0334] 62. A method according to one of the references 58 or 59, further comprising generating motion pictures based on the images by executing the program.

[0335] 63. A method according to reference 62, wherein the images comprise a plurality of discrete image frames and the program generates an intermediate frame by interpolating those image frames.

[0336] 64. An image processing apparatus, comprising: an image input unit which acquires images; and an extracting unit which extracts a program imprinted into the acquired images therefrom, which is utilized for reproducing the images.

[0337] 65. An image processing apparatus, comprising: an image input unit which acquires images; and an extracting unit which extracts a program imprinted into the acquired images therefrom, which is utilized for decoding the images.

[0338] 66. An apparatus according to reference 64, wherein the image input unit receives an electronic key which permits extraction or decoding of the program, and processing by the extracting unit is realized by the electronic key.

[0339] 67. An apparatus according to reference 64, further comprising: an intermediate image generator which performs interpolation of the images by utilizing the extracted program; and an output unit which outputs motion pictures acquired as a result of the interpolation.

[0340] 68. An image processing method, comprising: acquiring a first image and a second image as key frames, which are respectively a predetermined distance from each other; computing a matching between the acquired first and second images; compressing the first image and the second image in an intraframe format; imprinting a program which generates an intermediate image of the first image and the second image utilizing a result of the matching into at least one of the compressed first and second images; generating a coded motion picture stream which comprises at least the compressed first and second images as the key frames after imprinting; and outputting the generated coded motion picture stream.

[0341] 69. An image processing method, comprising: acquiring a first image and a second image as key frames, which are respectively a predetermined distance from each other; computing a matching between the acquired first and second images; compressing the first image and the second image in an intraframe format; imprinting a program which generates an intermediate image of the first image and the second image utilizing a result of the matching into a predetermined image in a coded motion picture stream which comprises the compressed first and second images; generating the coded motion picture stream which comprises at least the compressed first and second images and the predetermined image as the key frames after imprinting; and outputting the generated coded motion picture stream.

[0342] 70. A computer program executable by a computer, the program comprising the functions of: acquiring images; and imprinting a program for reproducing the images thereinto.

[0343] 71. A computer program executable by a computer, the program comprising the functions of: acquiring images; and imprinting a program for decoding the images thereinto.

[0344] 72. A computer program executable by a computer, the program comprising the functions of: acquiring images; and extracting a program imprinted into the acquired images therefrom, which is utilized for reproducing the images.

[0345] 73. A computer program executable by a computer, the program comprising the functions of: acquiring images; and extracting a program imprinted into the acquired images therefrom, which is utilized for decoding the images.

[0346] 74. A content storing method, comprising: acquiring a content in a digital format; and imprinting a program for reproducing or decoding the content thereinto.

[0347] 75. A method according to reference 74, wherein the content is provided with particularity in relationship to the program such that the entire content can be reproduced by utilizing the program, though the content is stored in a generalized format in which the content can be partly reproduced without the program.

[0348] 76. A method according to reference 74, wherein the content is provided with particularity in relationship to the program such that the content can be reproduced with high quality by utilizing the program, though the content is stored in a generalized format in which the content can be reproduced with low quality without the program.

What is claimed is:
 1. An image processing method comprising: acquiring images; and imprinting data utilized for processing the images into the images.
 2. A method according to claim 1, wherein the data comprise data regarding interpolation of the images.
 3. A method according to claim 1, wherein the data comprise information of corresponding points between at least selected images of the images.
 4. A method according to claim 1, further comprising distributing an electronic key for extracting the data.
 5. An image processing method comprising: acquiring images; and imprinting data utilized for decoding the images into the images.
 6. A method according to claim 5, wherein the data comprise data regarding interpolation of the images.
 7. A method according to claim 5, wherein the data comprise information of corresponding points between at least selected images of the images and other images.
 8. A method according to claim 5, further comprising distributing, to a user, an electronic key for extracting the data.
 9. An image processing method comprising: acquiring images; and extracting data imprinted into the acquired images therefrom and utilizing the extracted data for processing the acquired images.
 10. A method according to claim 9, wherein the data comprise data regarding interpolation of the images.
 11. A method according to claim 9, wherein the data comprise information of corresponding points between the images and other images.
 12. A method according to claim 9, wherein the processing comprises performing interpolation of the images based on the data; and further comprising: outputting motion pictures generated as a result of the interpolation.
 13. A method according to claim 9, further comprising acquiring an electronic key for permitting extraction prior to extracting the data.
 14. A method according to claim 10, wherein the processing comprises performing interpolation of the images based on the data; and further comprising: outputting motion pictures generated as a result of the interpolation.
 15. A method according to claim 11, wherein the processing comprises performing the interpolation of the images based on the data; and further comprising: outputting motion pictures generated as a result of the interpolation.
 16. An image processing method, comprising: acquiring images; and extracting data imprinted into the acquired images therefrom and utilizing the extracted data for decoding the images.
 17. A method according to claim 16, wherein the data comprise data regarding interpolation of the images.
 18. A method according to claim 16, wherein the data comprise information of corresponding points between the images and other images.
 19. A method according to claim 16, wherein the decoding comprises performing the interpolation of the images based on the data; and further comprising: outputting motion pictures acquired as a result of the interpolation.
 20. A method according to claim 16, further comprising acquiring an electronic key for permitting extraction prior to extracting the data.
 21. A method according to claim 17, wherein the decoding comprises performing the interpolation of the images based on the data; and further comprising: outputting motion pictures acquired as a result of the interpolation.
 22. A method according to claim 18, wherein the decoding comprises performing the interpolation of the images based on the data; and further comprising: outputting motion pictures acquired as a result of the interpolation.